binutils/dwarf: Print embedded source, when available
Checks
| Context | Check | Description |
| linaro-tcwg-bot/tcwg_binutils_build--master-arm | success | Build passed |
| linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 | success | Build passed |
| linaro-tcwg-bot/tcwg_binutils_check--master-arm | success | Test passed |
| linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 | success | Test passed |
Commit Message
When an object file with debugging information has the source embedded
(via DW_LNCT_LLVM_source), the `--dwarf=rawline` output from objdump
prints the source but the format leaves something to be desired.
This small patch makes it so that the source is printed starting on a
new line in the File Name Table when the user invokes `objdump` with
`--dwarf=rawline`.
The DW_LNCT_LLVM_source line number header entry format is slated to
be added to DWARFv6: https://dwarfstd.org/issues/180201.1.html and
lldb supports it (and gdb will soon support it).
If this is a feature that seems worthwhile, I would be more than happy
to add tests. I just didn't want to waste anyone's time with a long(ish)
patch at first.
I tried to follow all the proper coding style requirements, but I am
sure that there is something that I missed! Sorry in advance!
---
binutils/dwarf.c | 32 ++++++++++++++++++++++++++------
include/dwarf2.h | 1 +
2 files changed, 27 insertions(+), 6 deletions(-)
Comments
On 16.04.2026 16:21, Will Hawkins wrote:
> When an object file with debugging information has the source embedded
> (via DW_LNCT_LLVM_source), the `--dwarf=rawline` output from objdump
> prints the source but the format leaves something to be desired.
>
> This small patch makes it so that the source is printed starting on a
> new line in the File Name Table when the user invokes `objdump` with
> `--dwarf=rawline`.
>
> The DW_LNCT_LLVM_source line number header entry format is slated to
> be added to DWARFv6: https://dwarfstd.org/issues/180201.1.html and
> lldb supports it (and gdb will soon support it).
>
> If this is a feature that seems worthwhile, I would be more than happy
> to add tests. I just didn't want to waste anyone's time with a long(ish)
> patch at first.
Please go ahead with adding some testing.
> I tried to follow all the proper coding style requirements, but I am
> sure that there is something that I missed! Sorry in advance!
Well, first, and unless you have a copyright assignment in place with the
FSF, you need to sign-off on your patch.
> --- a/binutils/dwarf.c
> +++ b/binutils/dwarf.c
> @@ -5776,7 +5776,7 @@ display_formatted_table (unsigned char *data,
> {
> unsigned char *format_start, format_count, *format, formati;
> uint64_t data_count, datai;
> - unsigned int namepass, last_entry = 0;
> + unsigned int namepass, namesourcepass, last_entry = 0;
The new variable doesn't need scope wider than the loop it's used in. Then
questions towards the change to the loop itself will also be easier to
answer.
> @@ -5847,6 +5847,9 @@ display_formatted_table (unsigned char *data,
> case DW_LNCT_MD5:
> printf (_("\tMD5\t\t\t"));
> break;
> + case DW_LNCT_LLVM_source:
> + // Skip source ... display on next line.
Please follow GNU comment style (/* Skip source; display on next line. */).
> @@ -5872,13 +5876,29 @@ display_formatted_table (unsigned char *data,
>
> READ_ULEB (content_type, format, end);
> READ_ULEB (form, format, end);
> - bool do_loc = (content_type == DW_LNCT_path) != (namepass == 1);
> +
> + bool do_loc = (content_type == DW_LNCT_path)
> + != (namesourcepass == 1);
> + do_loc |= (content_type == DW_LNCT_LLVM_source)
> + != (namesourcepass == 2);
> +
> + char delimiter = '\t';
> +
> + /* Print Source last (if available) and print it
> + starting on the next line. */
> + if (namesourcepass == 2 && content_type == DW_LNCT_LLVM_source)
DW_LNCT_LLVM_source being in the DW_LNCT_{lo,hi}_user range, how do you
know here that 0x2001 actually means DW_LNCT_LLVM_source?
Jan
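To make the concern above concrete: 0x2001 lies in the vendor-extension range, where the DWARF standard assigns no meaning, so the value alone only says "some producer's extension". A minimal sketch in C, using the range bounds from include/dwarf2.h; the helper name lnct_is_user_range is hypothetical and not part of binutils:

```c
#include <stdbool.h>
#include <stdint.h>

/* Range bounds as they appear in include/dwarf2.h.  */
enum
{
  DW_LNCT_lo_user = 0x2000,
  DW_LNCT_hi_user = 0x3fff
};

/* Hypothetical helper: true if CONTENT_TYPE is a vendor extension,
   i.e. its meaning is chosen by the producer, not by the standard.  */
static bool
lnct_is_user_range (uint64_t content_type)
{
  return (content_type >= DW_LNCT_lo_user
	  && content_type <= DW_LNCT_hi_user);
}
```

A true result for 0x2001 shows why the value cannot, on its own, be tied to LLVM's embedded-source entry: another producer would be free to use the same number for something else.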
Thank you for the quick reply! See inline!
On Thu, Apr 16, 2026 at 10:47 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 16.04.2026 16:21, Will Hawkins wrote:
> > When an object file with debugging information has the source embedded
> > (via DW_LNCT_LLVM_source), the `--dwarf=rawline` output from objdump
> > prints the source but the format leaves something to be desired.
> >
> > This small patch makes it so that the source is printed starting on a
> > new line in the File Name Table when the user invokes `objdump` with
> > `--dwarf=rawline`.
> >
> > The DW_LNCT_LLVM_source line number header entry format is slated to
> > be added to DWARFv6: https://dwarfstd.org/issues/180201.1.html and
> > lldb supports it (and gdb will soon support it).
> >
> > If this is a feature that seems worthwhile, I would be more than happy
> > to add tests. I just didn't want to waste anyone's time with a long(ish)
> > patch at first.
>
> Please go ahead with adding some testing.
You got it!
>
> > I tried to follow all the proper coding style requirements, but I am
> > sure that there is something that I missed! Sorry in advance!
>
> Well, first, and unless you have a copyright assignment in place with the
> FSF, you need to sign-off on your patch.
>
Will do!
> > --- a/binutils/dwarf.c
> > +++ b/binutils/dwarf.c
> > @@ -5776,7 +5776,7 @@ display_formatted_table (unsigned char *data,
> > {
> > unsigned char *format_start, format_count, *format, formati;
> > uint64_t data_count, datai;
> > - unsigned int namepass, last_entry = 0;
> > + unsigned int namepass, namesourcepass, last_entry = 0;
>
> The new variable doesn't need scope wider than the loop it's used in. Then
> questions towards the change to the loop itself will also be easier to
> answer.
Nuts. Missed that! Absolutely will change!
>
> > @@ -5847,6 +5847,9 @@ display_formatted_table (unsigned char *data,
> > case DW_LNCT_MD5:
> > printf (_("\tMD5\t\t\t"));
> > break;
> > + case DW_LNCT_LLVM_source:
> > + // Skip source ... display on next line.
>
> Please follow GNU comment style (/* Skip source; display on next line. */).
Sorry!! I have no idea how that one slipped through!
>
> > @@ -5872,13 +5876,29 @@ display_formatted_table (unsigned char *data,
> >
> > READ_ULEB (content_type, format, end);
> > READ_ULEB (form, format, end);
> > - bool do_loc = (content_type == DW_LNCT_path) != (namepass == 1);
> > +
> > + bool do_loc = (content_type == DW_LNCT_path)
> > + != (namesourcepass == 1);
> > + do_loc |= (content_type == DW_LNCT_LLVM_source)
> > + != (namesourcepass == 2);
> > +
> > + char delimiter = '\t';
> > +
> > + /* Print Source last (if available) and print it
> > + starting on the next line. */
> > + if (namesourcepass == 2 && content_type == DW_LNCT_LLVM_source)
>
> DW_LNCT_LLVM_source being in the DW_LNCT_{lo,hi}_user range, how do you
> know here that 0x2001 actually means DW_LNCT_LLVM_source?
>
The literal comes from
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/Dwarf.def#L1213
I chose to use that one because clang is the most popular compiler
that implements this feature and I was hoping for as much
compatibility as possible.
However, its value in v6 will be 0x6
(https://dwarfstd.org/issues/180201.1.html).
We could consider using both? I have been working with this feature in
lldb and would be more than happy to submit a patch there that changes
it to use 0x6. Of course, there will likely need to be some support
for backwards compatibility (assuming that there are binaries in the
wild using 0x2001).
I would be more than happy to do whatever you think is best!
Thank you again for the review! I will prepare a v2 ASAP!
Will
> Jan
On 16.04.2026 16:56, Will Hawkins wrote:
> On Thu, Apr 16, 2026 at 10:47 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 16.04.2026 16:21, Will Hawkins wrote:
>>> @@ -5872,13 +5876,29 @@ display_formatted_table (unsigned char *data,
>>>
>>> READ_ULEB (content_type, format, end);
>>> READ_ULEB (form, format, end);
>>> - bool do_loc = (content_type == DW_LNCT_path) != (namepass == 1);
>>> +
>>> + bool do_loc = (content_type == DW_LNCT_path)
>>> + != (namesourcepass == 1);
>>> + do_loc |= (content_type == DW_LNCT_LLVM_source)
>>> + != (namesourcepass == 2);
>>> +
>>> + char delimiter = '\t';
>>> +
>>> + /* Print Source last (if available) and print it
>>> + starting on the next line. */
>>> + if (namesourcepass == 2 && content_type == DW_LNCT_LLVM_source)
>>
>> DW_LNCT_LLVM_source being in the DW_LNCT_{lo,hi}_user range, how do you
>> know here that 0x2001 actually means DW_LNCT_LLVM_source?
>>
>
> The literal comes from
> https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/Dwarf.def#L1213
>
> I chose to use that one because clang is the most popular compiler
> that implements this feature and I was hoping for as much
> compatibility as possible.
>
> However, its value in v6 will be 0x6
> (https://dwarfstd.org/issues/180201.1.html).
And its name will be DW_LNCT_source. No LLVM in there. And no need for
disambiguation of the value (once Dwarf6 is finalized).
> We could consider using both? I have been working with this feature in
> lldb and would be more than happy to submit a patch there that changes
> it to use 0x6. Of course, there will likely need to be some support
> for backwards compatibility (assuming that there are binaries in the
> wild using 0x2001).
Sure, yet that won't eliminate the need to disambiguate LLVM's use of
0x2001 from any one else's.
Jan
>>>>> Will Hawkins <hawkinsw@obs.cr> writes:
> The DW_LNCT_LLVM_source line number header entry format is slated to
> be added to DWARFv6: https://dwarfstd.org/issues/180201.1.html and
> lldb supports it (and gdb will soon support it).
> include/dwarf2.h | 1 +
FWIW changes to this file have to be propagated to the gcc repo. There
isn't a problem here but it might be easier if this were a separate
commit.
> diff --git a/include/dwarf2.h b/include/dwarf2.h
> index bf9287f1..ee546a40 100644
> --- a/include/dwarf2.h
> +++ b/include/dwarf2.h
> @@ -294,6 +294,7 @@ enum dwarf_line_number_content_type
> DW_LNCT_size = 0x4,
> DW_LNCT_MD5 = 0x5,
> DW_LNCT_lo_user = 0x2000,
> + DW_LNCT_LLVM_source = 0x2001,
> DW_LNCT_hi_user = 0x3fff
> };
Tom
>>>>> "Jan" == Jan Beulich <jbeulich@suse.com> writes:
>> We could consider using both? I have been working with this feature in
>> lldb and would be more than happy to submit a patch there that changes
>> it to use 0x6. Of course, there will likely need to be some support
>> for backwards compatibility (assuming that there are binaries in the
>> wild using 0x2001).
Jan> Sure, yet that won't eliminate the need to disambiguate LLVM's use of
Jan> 0x2001 from any one else's.
This probably isn't an issue here considering that there aren't any
other extensions in dwarf_line_number_content_type yet. But anyway the
approach taken here is basically what's already done in many other
cases. And while I think there have been some clashes, normally this
isn't a problem because the various open projects coordinate a bit.
Supporting both in-development and final published values is also fairly
common.
thanks,
Tom
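Accepting both the in-development and the eventual published value could look roughly like the sketch below. This is an illustration only, not the v2 patch: the helper lnct_is_embedded_source is hypothetical, and DW_LNCT_source = 0x6 is the name and value mentioned in this thread for DWARF 6, which remain assumptions until that standard is finalized.

```c
#include <stdbool.h>
#include <stdint.h>

/* 0x2001 is the vendor-range value LLVM uses today, as quoted in this
   thread.  DW_LNCT_source = 0x6 is the proposed DWARF 6 name and value;
   treat it as an assumption until the standard is published.  */
enum
{
  DW_LNCT_source = 0x6,
  DW_LNCT_LLVM_source = 0x2001
};

/* Hypothetical helper: true if CONTENT_TYPE denotes embedded source
   under either the vendor encoding or the proposed standard one.  */
static bool
lnct_is_embedded_source (uint64_t content_type)
{
  return (content_type == DW_LNCT_source
	  || content_type == DW_LNCT_LLVM_source);
}
```

Callers would then test lnct_is_embedded_source (content_type) wherever the patch compares against DW_LNCT_LLVM_source directly. It does not address the separate point that 0x2001 could be used differently by another producer.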
On Thu, Apr 16, 2026 at 11:35 AM Tom Tromey <tom@tromey.com> wrote:
>
> >>>>> "Jan" == Jan Beulich <jbeulich@suse.com> writes:
>
> >> We could consider using both? I have been working with this feature in
> >> lldb and would be more than happy to submit a patch there that changes
> >> it to use 0x6. Of course, there will likely need to be some support
> >> for backwards compatibility (assuming that there are binaries in the
> >> wild using 0x2001).
>
> Jan> Sure, yet that won't eliminate the need to disambiguate LLVM's use of
> Jan> 0x2001 from any one else's.
>
> This probably isn't an issue here considering that there aren't any
> other extensions in dwarf_line_number_content_type yet. But anyway the
> approach taken here is basically what's already done in many other
> cases. And while I think there have been some clashes, normally this
> isn't a problem because the various open projects coordinate a bit.
> Supporting both in-development and final published values is also fairly
> common.
>
> thanks,
> Tom
Thank you both for being open to adding this little nicety! A new
version of the patch is on the way!
Sincerely,
Will
Patch
--- a/binutils/dwarf.c
+++ b/binutils/dwarf.c
@@ -5776,7 +5776,7 @@ display_formatted_table (unsigned char *data,
{
unsigned char *format_start, format_count, *format, formati;
uint64_t data_count, datai;
- unsigned int namepass, last_entry = 0;
+ unsigned int namepass, namesourcepass, last_entry = 0;
const char * table_name = is_dir ? N_("Directory Table") : N_("File Name Table");
SAFE_BYTE_GET_AND_INC (format_count, data, 1, end);
@@ -5847,6 +5847,9 @@ display_formatted_table (unsigned char *data,
case DW_LNCT_MD5:
printf (_("\tMD5\t\t\t"));
break;
+ case DW_LNCT_LLVM_source:
+ // Skip source ... display on next line.
+ break;
default:
printf (_("\t(Unknown format content type %" PRIu64 ")"),
content_type);
@@ -5861,8 +5864,9 @@ display_formatted_table (unsigned char *data,
unsigned char *datapass = data;
printf (" %d", last_entry++);
- /* Delay displaying name as the last entry for better screen layout. */
- for (namepass = 0; namepass < 2; namepass++)
+ /* Delay displaying name/source as the last entry for better screen
+ layout. */
+ for (namesourcepass = 0; namesourcepass < 3; namesourcepass++)
{
format = format_start;
data = datapass;
@@ -5872,13 +5876,29 @@ display_formatted_table (unsigned char *data,
READ_ULEB (content_type, format, end);
READ_ULEB (form, format, end);
- bool do_loc = (content_type == DW_LNCT_path) != (namepass == 1);
+
+ bool do_loc = (content_type == DW_LNCT_path)
+ != (namesourcepass == 1);
+ do_loc |= (content_type == DW_LNCT_LLVM_source)
+ != (namesourcepass == 2);
+
+ char delimiter = '\t';
+
+ /* Print Source last (if available) and print it
+ starting on the next line. */
+ if (namesourcepass == 2 && content_type == DW_LNCT_LLVM_source)
+ {
+ delimiter = ' ';
+ putchar ('\n');
+ printf (" Source:");
+ }
data = read_and_display_attr_value (0, form, 0, start, data, end,
0, linfo->li_address_size,
linfo->li_offset_size,
linfo->li_version, NULL,
- do_loc, section, NULL, '\t',
- -1, false, 0, 0, false);
+ do_loc, section, NULL,
+ delimiter, -1, false, 0, 0,
+ false);
}
}
--- a/include/dwarf2.h
+++ b/include/dwarf2.h
@@ -294,6 +294,7 @@ enum dwarf_line_number_content_type
DW_LNCT_size = 0x4,
DW_LNCT_MD5 = 0x5,
DW_LNCT_lo_user = 0x2000,
+ DW_LNCT_LLVM_source = 0x2001,
DW_LNCT_hi_user = 0x3fff
};
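The three-pass selection in the dwarf.c hunk can be checked in isolation. The sketch below copies the do_loc expression from the patch into a hypothetical helper, skip_in_pass. It assumes that a true do_loc makes read_and_display_attr_value consume the value without printing it, which is how the original two-pass loop appears to use it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Content type codes: path and MD5 are the DWARF 5 values, the source
   value is the one added to include/dwarf2.h by this patch.  */
enum
{
  DW_LNCT_path = 0x1,
  DW_LNCT_MD5 = 0x5,
  DW_LNCT_LLVM_source = 0x2001
};

/* Hypothetical helper mirroring the patch's do_loc computation.
   Returns true when the field is skipped (not displayed) in PASS.  */
static bool
skip_in_pass (uint64_t content_type, unsigned int pass)
{
  bool do_loc = (content_type == DW_LNCT_path) != (pass == 1);
  do_loc |= (content_type == DW_LNCT_LLVM_source) != (pass == 2);
  return do_loc;
}
```

Under that assumption each field is displayed exactly once: everything except the path and the source in pass 0, the path in pass 1, and the source in pass 2.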