[PATCHv2] gdb/unwinders: better support for $pc not saved

In V2 I've addressed the test failure that Keith pointed out.  The
only changes between V1 and V2 are in the testsuite; the GDB side of
things is unchanged.

The problem is, I suspect, caused by an S390 compiler bug, though I'm
not 100% sure; maybe I'm just missing something.  If there is a bug
then it's a weird little issue related to debug information
generation, not a functionality bug.

The test as originally written created a Python unwinder which claimed
any frame for the function break_bt_here.  The frame-id the unwinder
creates for the break_bt_here frame is just the $pc and $sp values
unwound from the next frame.

This of course is a bad frame-id.  A frame-id should remain the same
for every address within a function, which is why GDB usually uses the
$pc for the start of a function.

Still, for the sake of this test it didn't really matter.  Or so I
thought.

The failure Keith pointed out only happened when we had a stack like
this:

  main -> break_bt_here -> other_func

My Python unwinder would claim the break_bt_here frame and create a
frame-id by unwinding the $pc and $sp values from `other_func`.

As other_func is a trivial (empty) leaf function, not every
architecture is going to need to allocate a stack frame for this
function.  For example, on S390 the return address is placed into r14
and not onto the stack.

As such, my expectation, for a stack-grows-down architecture, is that
the $sp value in other_func would be equal to, or less than, the $sp
value in break_bt_here.

In my experience the DWARF Canonical Frame Address (CFA) is usually
the stack pointer value on entry to a function, so my expectation was
that the CFA of other_func would be equal to, or less than, the
unwound $sp value in break_bt_here, so (I thought) I'd be fine to use
the unwound $sp value as the frame-id stack address in break_bt_here.

However, on S390, the DWARF CIE that covers other_func looks like this:

  00000000 0000000000000014 00000000 CIE
    Version:               1
    Augmentation:          "zR"
    Code alignment factor: 1
    Data alignment factor: -8
    Return address column: 14
    Augmentation data:     1b
    DW_CFA_def_cfa: r15 ofs 160
    DW_CFA_undefined: r14
    DW_CFA_nop

While the FDE for other_func is:

  00000070 0000000000000018 00000048 FDE cie=0000002c pc=00000000010011a0..00000000010011b0
    DW_CFA_advance_loc: 4 to 00000000010011a4
    DW_CFA_register: r11 in r16 (f0)
    DW_CFA_advance_loc: 4 to 00000000010011a8
    DW_CFA_def_cfa_register: r11
    DW_CFA_advance_loc: 6 to 00000000010011ae
    DW_CFA_restore: r11
    DW_CFA_def_cfa_register: r15

Notice the 'DW_CFA_def_cfa: r15 ofs 160' in the CIE.  This sets up a
default offset of 160 bytes for the CFA.

I don't believe there's anything really "wrong" with this, but it does
seem weird.  It means that the stack-address within the frame-id for
other_func will be 160 bytes higher than the $sp value on entry to the
function other_func.
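
To make the arithmetic concrete, here is a small Python sketch (not
part of the patch).  The $sp value is invented for illustration; only
the 160 byte offset and the use of r15 come from the CIE above.

```python
# Hypothetical $sp (r15) on entry to other_func; the value is invented.
SP_ON_ENTRY = 0x3FFFFFFE000

# From the CIE: 'DW_CFA_def_cfa: r15 ofs 160'.
CFA_OFFSET = 160


def cfa_from_sp(sp, offset=CFA_OFFSET):
    """Return the CFA implied by 'DW_CFA_def_cfa: r15 ofs <offset>'."""
    return sp + offset


cfa = cfa_from_sp(SP_ON_ENTRY)
# The frame-id stack address is the CFA, so it sits 160 bytes above $sp.
print(hex(SP_ON_ENTRY), hex(cfa), cfa - SP_ON_ENTRY)
```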

Now this isn't normally a problem as break_bt_here, when using the
DWARF unwinder rather than my custom Python unwinder, will also have a
160 byte offset, so the frame-id for break_bt_here will still appear
to be for an earlier stack frame.

However, my custom Python unwinder doesn't understand this "magic" 160
byte offset, and instead just uses the unwound stack value.

Which means that when I have a stack like this:

  main -> break_bt_here -> other_func

The stack address in the frame-id for break_bt_here is actually less
than the stack address in the frame-id for other_func.  This then
triggers the frame_id_inner check within get_prev_frame_always_1 and
GDB thinks something has gone wrong with the stack unwind, leading to
the failure Keith reported.
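
The comparison can be modelled in a few lines of Python.  This is a
simplified sketch, not GDB's actual frame_id_inner: it only compares
stack addresses on a stack-grows-down target, the $sp value is
invented, and it assumes other_func allocates no stack of its own.

```python
CFA_OFFSET = 160  # from 'DW_CFA_def_cfa: r15 ofs 160' in the CIE


def is_inner(this_stack_addr, next_stack_addr):
    """Simplified model of the check: on a stack-grows-down target a
    caller's frame-id stack address must not be below its callee's."""
    return this_stack_addr < next_stack_addr


# Invented $sp; assume the leaf function left $sp unchanged.
sp_other_func = 0x3FFFFFFE000
sp_break_bt_here = sp_other_func

# Frame-id stack addresses as each unwinder would compute them.
other_func_id = sp_other_func + CFA_OFFSET    # DWARF unwinder
dwarf_id = sp_break_bt_here + CFA_OFFSET      # DWARF unwinder
python_id = sp_break_bt_here                  # unwound $sp, no offset

print(is_inner(dwarf_id, other_func_id))   # False: unwind continues
print(is_inner(python_id, other_func_id))  # True: unwind is rejected
```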

My solution to this problem is to run up to break_bt_here without the
Python unwinder loaded.  Use 'maint print frame-id' to capture the
frame-id generated from the DWARF unwinder, and then restart GDB and
load the Python unwinder.

I can then tell the Python unwinder the contents of the DWARF
generated frame-id, and the Python unwinder will report this as its
frame-id.
For x86-64 very little changes.  But for S390 I now use a stack
address that includes the "magic" 160 byte offset, and everything is
fine.
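
As a rough illustration of the capture step, the sketch below parses a
'maint print frame-id' line and builds an object with 'sp' and 'pc'
attributes, which is the shape the Python unwinder API expects for a
frame-id.  The output format matched by the regular expression is an
assumption and should be checked against your GDB; the sample line,
its addresses, and the names parse_frame_id and CapturedFrameId are
invented for this example.

```python
import re

# Assumed shape of the 'maint print frame-id' output.
FRAME_ID_RE = re.compile(
    r"frame-id for frame #\d+: "
    r"\{stack=(0x[0-9a-fA-F]+),code=(0x[0-9a-fA-F]+),[^}]*\}"
)


class CapturedFrameId:
    """Plain object carrying the 'sp' and 'pc' attributes of a frame-id."""

    def __init__(self, sp, pc):
        self.sp = sp
        self.pc = pc


def parse_frame_id(text):
    """Extract the stack and code addresses from the captured output."""
    match = FRAME_ID_RE.search(text)
    if match is None:
        raise ValueError("unrecognised frame-id output: %r" % text)
    return CapturedFrameId(int(match.group(1), 16), int(match.group(2), 16))


# Invented sample line; real addresses will differ.
sample = "frame-id for frame #0: {stack=0x7fffffffe0f0,code=0x0000000000401106,!special}"
fid = parse_frame_id(sample)
print(hex(fid.sp), hex(fid.pc))
```

Inside GDB, an object like this could be handed to the unwinder in
place of one built from the unwound $sp and $pc values.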

---

This started with a Red Hat bug report which can be seen here:

  https://bugzilla.redhat.com/show_bug.cgi?id=1850710

The problem was reported while using GDB on GNU/Linux for S390: the
user stepped into JIT generated code.  As they entered the JIT code
GDB would report 'PC not saved', and this same message would be
reported after each step/stepi.

Additionally, the user had 'set disassemble-next-line on', and once
they entered the JIT code this output was not displayed, nor were any
'display' directives displayed.

The user is not making use of the JIT plugin API to provide debug
information.  But that's OK, they aren't expecting any source level
debug here, they are happy to use 'stepi', but the missing 'display'
directives are a problem, as is the constant 'PC not saved' (error)
message.

What is happening here is that, as GDB fails to find any debug
information for the JIT generated code, it falls back on the S390
prologue unwinder to try to unwind frame #0.  Unfortunately, without
being able to identify the function boundaries, the S390 prologue
scanner can't help much.  In fact, it doesn't even suggest an
arbitrary previous $pc value (some targets that use a link-register
will, by default, assume the link-register contains the previous
$pc); instead the S390 unwinder will just say, "sorry, I have no
previous $pc value".

The result of this is that when GDB tries to find frame #1 we end up
throwing an error from frame_unwind_pc (the 'PC not saved' error).
This error is not caught anywhere except at the top-level interpreter
loop, and so we end up skipping all the 'display' directive handling.
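
The control flow can be pictured with a toy Python model.  This is
only a sketch of the behaviour described above, not GDB code; apart
from frame_unwind_pc, every name here is invented.

```python
class PcNotSavedError(Exception):
    """Stands in for the error GDB throws from frame_unwind_pc."""


def frame_unwind_pc(saved_pc):
    if saved_pc is None:
        raise PcNotSavedError("PC not saved")
    return saved_pc


def handle_stop(saved_pc, output):
    # Looking up frame #1 needs the unwound $pc of frame #0.
    frame_unwind_pc(saved_pc)
    # Only reached if the unwind above did not throw.
    output.append("display: $pc")


def top_level(saved_pc):
    output = []
    try:
        handle_stop(saved_pc, output)
    except PcNotSavedError as err:
        # The only handler is out here, so the display step was skipped.
        output.append("error: %s" % err)
    return output


print(top_level(0x1000))  # ['display: $pc']
print(top_level(None))    # ['error: PC not saved']
```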

While thinking about this, I wondered, could I trigger the same error
using the Python Unwinder API?  What happens if a Python unwinder
claims a frame, but then fails to provide a previous $pc value?

Turns out that exactly the same thing happens, which is great, as that
means we now have a way to reproduce this bug on any target.  And so
the test included with this patch does just this.  I have a Python
unwinder that claims a frame, but doesn't provide any previous
register values.

I then do two tests, first I stop in the claimed frame (i.e. frame #0
is the frame that can't be unwound), I perform a few steps, and check
the backtrace.  And second, I stop in a child of the problem
frame (i.e. frame #1 is the frame that can't be unwound), and from
here I check the backtrace.

While all this is going on I have a 'display' directive in place, and
each time GDB stops I check that the display directive triggers.

Additionally, when checking the backtrace, I am checking that the
backtrace finishes with the message 'Backtrace stopped: frame did not
save the PC'.

As for the fix I chose to add a call to frame_unwind_pc directly to
get_prev_frame_always_1.  Calling frame_unwind_pc will cache the
unwound $pc value, so this doesn't add much additional work as
immediately after the new frame_unwind_pc call, we call
get_prev_frame_maybe_check_cycle, which actually generates the
previous frame, which will always (I think) require a call to
frame_unwind_pc anyway.

The reason for adding the frame_unwind_pc call into
get_prev_frame_always_1 is that if the frame_unwind_pc call fails we
want to set the frame's 'stop_reason', and get_prev_frame_always_1
seems to be the place where this is done, so I wanted to keep the new
stop_reason setting code next to all the existing stop_reason setting
code.

Additionally, once we enter get_prev_frame_maybe_check_cycle we
actually create the previous frame; then, if it turns out that the
previous frame can't be created, we need to remove the frame.  This
seemed more complex than just making the check in
get_prev_frame_always_1.
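
The shape of the fix can be sketched with a toy Python model.  This
is not the code added to gdb/frame.c: the Frame class, the callers
table, and the backtrace helper are invented, and only the function
names and the 'frame did not save the PC' message come from the text
above.

```python
class PcNotSavedError(Exception):
    """Stands in for the error GDB throws from frame_unwind_pc."""


class Frame:
    def __init__(self, name, saved_pc=None):
        self.name = name
        self.saved_pc = saved_pc
        self.stop_reason = None


def frame_unwind_pc(frame):
    if frame.saved_pc is None:
        raise PcNotSavedError("PC not saved")
    return frame.saved_pc


def get_prev_frame_always_1(frame, callers):
    # Model of the fix: try the $pc unwind first and turn a failure
    # into a stop reason instead of letting the error escape.
    try:
        frame_unwind_pc(frame)
    except PcNotSavedError:
        frame.stop_reason = "frame did not save the PC"
        return None
    return callers.get(frame.name)


def backtrace(frame, callers):
    lines = []
    level = 0
    while frame is not None:
        lines.append("#%d %s" % (level, frame.name))
        prev = get_prev_frame_always_1(frame, callers)
        if prev is None and frame.stop_reason is not None:
            lines.append("Backtrace stopped: %s" % frame.stop_reason)
        frame = prev
        level += 1
    return lines


# other_func is called from break_bt_here, which cannot be unwound.
problem = Frame("break_bt_here")
main = Frame("main", saved_pc=0x1000)
other = Frame("other_func", saved_pc=0x2000)
callers = {"other_func": problem, "break_bt_here": main}
print("\n".join(backtrace(other, callers)))
```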

With this fix in place the original S390 bug is fixed, and the test
added in this commit, which uses the Python API, now passes.

Reviewed-By: Kevin Buettner <kevinb@redhat.com>
---
 gdb/frame.c                             |  32 +++++++
 gdb/testsuite/gdb.base/pc-not-saved.c   |  48 ++++++++++
 gdb/testsuite/gdb.base/pc-not-saved.exp | 113 ++++++++++++++++++++++++
 gdb/testsuite/gdb.base/pc-not-saved.py  |  71 +++++++++++++++
 4 files changed, 264 insertions(+)
 create mode 100644 gdb/testsuite/gdb.base/pc-not-saved.c
 create mode 100644 gdb/testsuite/gdb.base/pc-not-saved.exp
 create mode 100644 gdb/testsuite/gdb.base/pc-not-saved.py

base-commit: 05d1b4b4ad7d74a64cc71c53d621241fc393fcb6

Message ID	2137e3edccdc997be04f3235718d772b5be97727.1706885900.git.aburgess@redhat.com
State	New
From	Andrew Burgess <aburgess@redhat.com>
To	gdb-patches@sourceware.org
Cc	Andrew Burgess <aburgess@redhat.com>, Keith Seitz <keiths@redhat.com>, Kevin Buettner <kevinb@redhat.com>
Date	Fri, 2 Feb 2024 15:19:51 +0000
In-Reply-To	<4d3be44a23e2e5853d966c081bccbeb751004310.1706366387.git.aburgess@redhat.com>

Context	Check	Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64	success	Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm	success	Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64	success	Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm	success	Testing passed