dwarf2out: Emit DWARF 6 DW_AT_language_{name,version}
Checks
Context | Check | Description
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | success | Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm | success | Build passed
Commit Message
Hi!
The DWARF committee voted in https://dwarfstd.org/issues/241209.1.html yesterday,
which is basically just a guarantee that the DWARF 6 draft
DW_AT_language_{name,version} attribute codes and content of
https://dwarfstd.org/languages-v6.html can be used as an extension
in DWARF 5 and won't be changed.
So, this patch is an alternative to the
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
patch, which had the major problem that it required changing all the
DWARF consumers to be able to debug C17 or later or C++17 or later
sources.
This patch still uses DWARF 5 DW_LANG_C11 or DW_LANG_C_plus_plus_14,
the latest codes in DWARF 5 proper, so all DWARF 5 capable consumers
should be able to deal with that, but additionally emits the
DWARF 6 attributes so that newer DWARF consumers can see it isn't
just C++14 but, say, C++23, or not just C11 but C23. Consumers which don't know
those DWARF 6 attributes would just ignore them. This is like any other
-gno-strict-dwarf extension, except that normally we emit say DWARF 5
codes where possible only after DWARF 5 is released, while in this case
there is a guarantee it can be used before DWARF 6 is released.
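As a rough illustration of the consumer side described above (not code from GCC or from any debugger), a reader of the compile unit could prefer the DWARF 6 pair when it recognizes the name and otherwise fall back to DW_AT_language. `UnitAttrs`, `get_attr` and `describe_language` are hypothetical names invented for this sketch; the numeric codes are the ones from this patch and from DWARF 5.

```cpp
#include <cassert>   // only needed for the usage checks mentioned below
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Attribute and language codes as used in this patch and in DWARF 5.
constexpr std::uint16_t DW_AT_language = 0x13;
constexpr std::uint16_t DW_AT_language_name = 0x90;
constexpr std::uint16_t DW_AT_language_version = 0x91;
constexpr std::uint64_t DW_LANG_C11 = 0x1d;
constexpr std::uint64_t DW_LANG_C_plus_plus_14 = 0x21;
constexpr std::uint64_t DW_LNAME_C = 0x3;
constexpr std::uint64_t DW_LNAME_C_plus_plus = 0x4;

// Hypothetical stand-in for a compile unit DIE: attribute code -> constant.
using UnitAttrs = std::map<std::uint16_t, std::uint64_t>;

static std::optional<std::uint64_t> get_attr(const UnitAttrs &cu,
                                             std::uint16_t at) {
  auto it = cu.find(at);
  if (it == cu.end())
    return std::nullopt;
  return it->second;
}

// Prefer DW_AT_language_{name,version}; fall back to DW_AT_language when the
// pair is absent or the name is not one this sketch knows about.
std::string describe_language(const UnitAttrs &cu) {
  if (auto lname = get_attr(cu, DW_AT_language_name)) {
    std::uint64_t version = get_attr(cu, DW_AT_language_version).value_or(0);
    if (*lname == DW_LNAME_C)
      return "C " + std::to_string(version);
    if (*lname == DW_LNAME_C_plus_plus)
      return "C++ " + std::to_string(version);
  }
  auto lang = get_attr(cu, DW_AT_language);
  if (lang && *lang == DW_LANG_C11)
    return "C11";
  if (lang && *lang == DW_LANG_C_plus_plus_14)
    return "C++14";
  return "unknown";
}
```

A consumer that never looks at 0x90/0x91 behaves like the fallback path: it reports C++14 for a unit that a newer consumer would report as C++ 202302.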
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
2025-01-07 Jakub Jelinek <jakub@redhat.com>
include/
* dwarf2.h (enum dwarf_source_language): Fix comment pasto.
(enum dwarf_source_language_name): New type.
* dwarf2.def (DW_AT_language_name, DW_AT_language_version): New
DWARF 6 codes.
gcc/
* dwarf2out.cc (break_out_comdat_types): Copy over
DW_AT_language_{name,version} if present.
(output_skeleton_debug_sections): Remove also
DW_AT_language_{name,version}.
(gen_compile_unit_die): For C17, C23, C2Y, C++17, C++20, C++23
and C++26 emit for -gdwarf-5 -gno-strict-dwarf also
DW_AT_language_{name,version} attributes.
gcc/testsuite/
* g++.dg/debug/dwarf2/lang-cpp17.C: Add -gno-strict-dwarf to
dg-options. Check also for DW_AT_language_{name,version} values.
* g++.dg/debug/dwarf2/lang-cpp20.C: Likewise.
* g++.dg/debug/dwarf2/lang-cpp23.C: New test.
Jakub
Comments
On Tue, 7 Jan 2025, Jakub Jelinek wrote:
> Hi!
>
> DWARF has voted in yesterday https://dwarfstd.org/issues/241209.1.html ,
> which is basically just a guarantee that the DWARF 6 draft
> DW_AT_language_{name,version} attribute codes and content of
> https://dwarfstd.org/languages-v6.html can be used as an extension
> in DWARF 5 and won't be changed.
>
> So, this patch is an alternative to the
> https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
> patch, which had the major problem that it required changing all the
> DWARF consumers to be able to debug C17 or later or C++17 or later
> sources.
> This patch uses still DWARF 5 DW_LANG_C11 or DW_LANG_C_plus_plus_14,
> the latest code in DWARF 5 proper, so all DWARF 5 capable consumers
> should be able to deal with that, but additionally emits the
> DWARF 6 attributes so that newer DWARF consumers can see it isn't
> just C++14 but say C++23 or C11 but C23. Consumers which don't know
> those DWARF 6 attributes would just ignore them. This is like any other
> -gno-strict-dwarf extension, except that normally we emit say DWARF 5
> codes where possible only after DWARF 5 is released, while in this case
> there is a guarantee it can be used before DWARF 6 is released.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Ok with me.
Thanks,
Richard.
> 2025-01-07 Jakub Jelinek <jakub@redhat.com>
>
> include/
> * dwarf2.h (enum dwarf_source_language): Fix comment pasto.
> (enum dwarf_source_language_name): New type.
> * dwarf2.def (DW_AT_language_name, DW_AT_language_version): New
> DWARF 6 codes.
> gcc/
> * dwarf2out.cc (break_out_comdat_types): Copy over
> DW_AT_language_{name,version} if present.
> (output_skeleton_debug_sections): Remove also
> DW_AT_language_{name,version}.
> (gen_compile_unit_die): For C17, C23, C2Y, C++17, C++20, C++23
> and C++26 emit for -gdwarf-5 -gno-strict-dwarf also
> DW_AT_language_{name,version} attributes.
> gcc/testsuite/
> * g++.dg/debug/dwarf2/lang-cpp17.C: Add -gno-strict-dwarf to
> dg-options. Check also for DW_AT_language_{name,version} values.
> * g++.dg/debug/dwarf2/lang-cpp20.C: Likewise.
> * g++.dg/debug/dwarf2/lang-cpp23.C: New test.
>
> --- include/dwarf2.h.jj 2025-01-02 11:47:47.431981968 +0100
> +++ include/dwarf2.h 2025-01-06 18:55:59.802378204 +0100
> @@ -411,7 +411,7 @@ enum dwarf_source_language
> DW_LANG_Hylo = 0x0042,
>
> DW_LANG_lo_user = 0x8000, /* Implementation-defined range start. */
> - DW_LANG_hi_user = 0xffff, /* Implementation-defined range start. */
> + DW_LANG_hi_user = 0xffff, /* Implementation-defined range end. */
>
> /* MIPS. */
> DW_LANG_Mips_Assembler = 0x8001,
> @@ -428,6 +428,59 @@ enum dwarf_source_language
> DW_LANG_Rust_old = 0x9000
> };
>
> +/* DWARF 6 source language names and codes. */
> +enum dwarf_source_language_name
> + {
> + /* https://dwarfstd.org/languages-v6.html */
> + DW_LNAME_Ada = 0x0001,
> + DW_LNAME_BLISS = 0x0002,
> + DW_LNAME_C = 0x0003,
> + DW_LNAME_C_plus_plus = 0x0004,
> + DW_LNAME_Cobol = 0x0005,
> + DW_LNAME_Crystal = 0x0006,
> + DW_LNAME_D = 0x0007,
> + DW_LNAME_Dylan = 0x0008,
> + DW_LNAME_Fortran = 0x0009,
> + DW_LNAME_Go = 0x000a,
> + DW_LNAME_Haskell = 0x000b,
> + DW_LNAME_Java = 0x000c,
> + DW_LNAME_Julia = 0x000d,
> + DW_LNAME_Kotlin = 0x000e,
> + DW_LNAME_Modula2 = 0x000f,
> + DW_LNAME_Modula3 = 0x0010,
> + DW_LNAME_ObjC = 0x0011,
> + DW_LNAME_ObjC_plus_plus = 0x0012,
> + DW_LNAME_OCaml = 0x0013,
> + DW_LNAME_OpenCL_C = 0x0014,
> + DW_LNAME_Pascal = 0x0015,
> + DW_LNAME_PLI = 0x0016,
> + DW_LNAME_Python = 0x0017,
> + DW_LNAME_RenderScript = 0x0018,
> + DW_LNAME_Rust = 0x0019,
> + DW_LNAME_Swift = 0x001a,
> + DW_LNAME_UPC = 0x001b,
> + DW_LNAME_Zig = 0x001c,
> + DW_LNAME_Assembly = 0x001d,
> + DW_LNAME_C_sharp = 0x001e,
> + DW_LNAME_Mojo = 0x001f,
> + DW_LNAME_GLSL = 0x0020,
> + DW_LNAME_GLSL_ES = 0x0021,
> + DW_LNAME_HLSL = 0x0022,
> + DW_LNAME_OpenCL_CPP = 0x0023,
> + DW_LNAME_CPP_for_OpenCL = 0x0024,
> + DW_LNAME_SYCL = 0x0025,
> + DW_LNAME_Ruby = 0x0026,
> + DW_LNAME_Move = 0x0027,
> + DW_LNAME_Hylo = 0x0028,
> + DW_LNAME_HIP = 0x0029,
> + DW_LNAME_Odin = 0x002a,
> + DW_LNAME_P4 = 0x002b,
> + DW_LNAME_Metal = 0x002c,
> +
> + DW_LNAME_lo_user = 0x8000, /* Implementation-defined range start. */
> + DW_LNAME_hi_user = 0xffff /* Implementation-defined range end. */
> + };
> +
> /* Names and codes for macro information. */
> enum dwarf_macinfo_record_type
> {
> --- include/dwarf2.def.jj 2025-01-02 11:47:47.191985318 +0100
> +++ include/dwarf2.def 2025-01-06 18:39:02.642383150 +0100
> @@ -364,6 +364,9 @@ DW_AT (DW_AT_export_symbols, 0x89)
> DW_AT (DW_AT_deleted, 0x8a)
> DW_AT (DW_AT_defaulted, 0x8b)
> DW_AT (DW_AT_loclists_base, 0x8c)
> +/* DWARF 6. */
> +DW_AT (DW_AT_language_name, 0x90)
> +DW_AT (DW_AT_language_version, 0x91)
>
> DW_AT_DUP (DW_AT_lo_user, 0x2000) /* Implementation-defined range start. */
> DW_AT_DUP (DW_AT_hi_user, 0x3fff) /* Implementation-defined range end. */
> --- gcc/dwarf2out.cc.jj 2025-01-02 11:23:35.541251268 +0100
> +++ gcc/dwarf2out.cc 2025-01-07 10:09:16.866866563 +0100
> @@ -8755,6 +8755,14 @@ break_out_comdat_types (dw_die_ref die)
> unit = new_die (DW_TAG_type_unit, NULL, NULL);
> add_AT_unsigned (unit, DW_AT_language,
> get_AT_unsigned (comp_unit_die (), DW_AT_language));
> + if (unsigned lname = get_AT_unsigned (comp_unit_die (),
> + DW_AT_language_name))
> + {
> + add_AT_unsigned (unit, DW_AT_language_name, lname);
> + add_AT_unsigned (unit, DW_AT_language_version,
> + get_AT_unsigned (comp_unit_die (),
> + DW_AT_language_version));
> + }
>
> /* Add the new unit's type DIE into the comdat type list. */
> type_node = ggc_cleared_alloc<comdat_type_node> ();
> @@ -11404,6 +11412,8 @@ output_skeleton_debug_sections (dw_die_r
> /* These attributes will be found in the full debug_info section. */
> remove_AT (comp_unit, DW_AT_producer);
> remove_AT (comp_unit, DW_AT_language);
> + remove_AT (comp_unit, DW_AT_language_name);
> + remove_AT (comp_unit, DW_AT_language_version);
>
> switch_to_section (debug_skeleton_info_section);
> ASM_OUTPUT_LABEL (asm_out_file, debug_skeleton_info_section_label);
> @@ -25318,7 +25328,7 @@ gen_compile_unit_die (const char *filena
> {
> dw_die_ref die;
> const char *language_string = lang_hooks.name;
> - int language;
> + int language, lname, lversion;
>
> die = new_die (DW_TAG_compile_unit, NULL, NULL);
>
> @@ -25366,6 +25376,8 @@ gen_compile_unit_die (const char *filena
> }
>
> language = DW_LANG_C;
> + lname = 0;
> + lversion = 0;
> if (startswith (language_string, "GNU C")
> && ISDIGIT (language_string[5]))
> {
> @@ -25376,11 +25388,28 @@ gen_compile_unit_die (const char *filena
> language = DW_LANG_C99;
>
> if (dwarf_version >= 5 /* || !dwarf_strict */)
> - if (strcmp (language_string, "GNU C11") == 0
> - || strcmp (language_string, "GNU C17") == 0
> - || strcmp (language_string, "GNU C23") == 0
> - || strcmp (language_string, "GNU C2Y") == 0)
> - language = DW_LANG_C11;
> + {
> + if (strcmp (language_string, "GNU C11") == 0)
> + language = DW_LANG_C11;
> + else if (strcmp (language_string, "GNU C17") == 0)
> + {
> + language = DW_LANG_C11;
> + lname = DW_LNAME_C;
> + lversion = 201710;
> + }
> + else if (strcmp (language_string, "GNU C23") == 0)
> + {
> + language = DW_LANG_C11;
> + lname = DW_LNAME_C;
> + lversion = 202311;
> + }
> + else if (strcmp (language_string, "GNU C2Y") == 0)
> + {
> + language = DW_LANG_C11;
> + lname = DW_LNAME_C;
> + lversion = 202500;
> + }
> + }
> }
> }
> else if (startswith (language_string, "GNU C++"))
> @@ -25392,12 +25421,30 @@ gen_compile_unit_die (const char *filena
> language = DW_LANG_C_plus_plus_11;
> else if (strcmp (language_string, "GNU C++14") == 0)
> language = DW_LANG_C_plus_plus_14;
> - else if (strcmp (language_string, "GNU C++17") == 0
> - || strcmp (language_string, "GNU C++20") == 0
> - || strcmp (language_string, "GNU C++23") == 0
> - || strcmp (language_string, "GNU C++26") == 0)
> - /* For now. */
> - language = DW_LANG_C_plus_plus_14;
> + else if (strcmp (language_string, "GNU C++17") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 201703;
> + }
> + else if (strcmp (language_string, "GNU C++20") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 202002;
> + }
> + else if (strcmp (language_string, "GNU C++23") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 202302;
> + }
> + else if (strcmp (language_string, "GNU C++26") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 202400;
> + }
> }
> }
> else if (strcmp (language_string, "GNU F77") == 0)
> @@ -25441,6 +25488,11 @@ gen_compile_unit_die (const char *filena
> language = DW_LANG_Ada83;
>
> add_AT_unsigned (die, DW_AT_language, language);
> + if (lname && dwarf_version >= 5 && !dwarf_strict)
> + {
> + add_AT_unsigned (die, DW_AT_language_name, lname);
> + add_AT_unsigned (die, DW_AT_language_version, lversion);
> + }
>
> switch (language)
> {
> --- gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp17.C.jj 2021-01-18 07:18:14.929659650 +0100
> +++ gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp17.C 2025-01-07 10:07:46.473125326 +0100
> @@ -1,8 +1,10 @@
> // { dg-do compile }
> -// { dg-options "-O -std=c++17 -gdwarf-5 -dA" }
> +// { dg-options "-O -std=c++17 -gdwarf-5 -dA -gno-strict-dwarf" }
> // { dg-skip-if "AIX DWARF5" { powerpc-ibm-aix* } }
> -// For -gdwarf-6 hopefully DW_LANG_C_plus_plus_17
> // DW_LANG_C_plus_plus_14 = 0x0021
> +// DW_LNAME_C_plus_plus = 0x0004 201703
> // { dg-final { scan-assembler "0x21\[^\n\r]* DW_AT_language" } } */
> +// { dg-final { scan-assembler "0x4\[^\n\r]* DW_AT_language_name" } } */
> +// { dg-final { scan-assembler "0x313e7\[^\n\r]* DW_AT_language_version" } } */
>
> int version;
> --- gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp20.C.jj 2021-01-18 14:52:42.946040137 +0100
> +++ gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp20.C 2025-01-07 10:08:28.982533366 +0100
> @@ -1,8 +1,10 @@
> // { dg-do compile }
> -// { dg-options "-O -std=c++20 -gdwarf-5 -dA" }
> +// { dg-options "-O -std=c++20 -gdwarf-5 -dA -gno-strict-dwarf" }
> // { dg-skip-if "AIX DWARF5" { powerpc-ibm-aix* } }
> -// For -gdwarf-6 hopefully DW_LANG_C_plus_plus_20
> // DW_LANG_C_plus_plus_14 = 0x0021
> +// DW_LNAME_C_plus_plus = 0x0004 202002
> // { dg-final { scan-assembler "0x21\[^\n\r]* DW_AT_language" } } */
> +// { dg-final { scan-assembler "0x4\[^\n\r]* DW_AT_language_name" } } */
> +// { dg-final { scan-assembler "0x31512\[^\n\r]* DW_AT_language_version" } } */
>
> int version;
> --- gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp23.C.jj 2025-01-07 10:07:54.926007612 +0100
> +++ gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp23.C 2025-01-07 10:08:19.206669497 +0100
> @@ -0,0 +1,10 @@
> +// { dg-do compile }
> +// { dg-options "-O -std=c++23 -gdwarf-5 -dA -gno-strict-dwarf" }
> +// { dg-skip-if "AIX DWARF5" { powerpc-ibm-aix* } }
> +// DW_LANG_C_plus_plus_14 = 0x0021
> +// DW_LNAME_C_plus_plus = 0x0004 202302
> +// { dg-final { scan-assembler "0x21\[^\n\r]* DW_AT_language" } } */
> +// { dg-final { scan-assembler "0x4\[^\n\r]* DW_AT_language_name" } } */
> +// { dg-final { scan-assembler "0x3163e\[^\n\r]* DW_AT_language_version" } } */
> +
> +int version;
>
> Jakub
>
>
> So, this patch is an alternative to the
> https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
> patch, which had the major problem that it required changing all the
> DWARF consumers to be able to debug C17 or later or C++17 or later
> sources.
Do you plan to salvage the non-obsoleted parts of the above change?
On Wed, Jan 08, 2025 at 09:14:59AM +0100, Eric Botcazou wrote:
> > So, this patch is an alternative to the
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
> > patch, which had the major problem that it required changing all the
> > DWARF consumers to be able to debug C17 or later or C++17 or later
> > sources.
>
> Do you plan to salvage the non-obsoleted parts of the above change?
No. The switches on DW_AT_language value are a GCC internal thing,
if we never generate say the post-DWARF5 DW_LANG_C_plus_plus_23,
we don't need to handle it in the switch.
Though, Ada and Fortran could have a similar change to the C/C++
one, i.e. also add DW_AT_language_{name,version} for newer Ada and Fortran
versions, say DW_LNAME_Ada and then dunno whether 2005, 2012 and 2022
or 2007, 2012 and 2023, in https://dwarfstd.org/languages-v6.html
the versioning scheme for Ada (as well as Fortran) is YYYY.
Similarly DW_LNAME_Fortran 2018 and 2023.
The reason I haven't done it myself is that the Ada FE doesn't tell
Ada version at all - there is just "GNU Ada" and dwarf2out.cc right now
implies it is Ada 95 for DWARF3 and Ada 83 otherwise.
And in the Fortran case, while the FE provides a version, it only
does so for Fortran 2003 and 2008 (and just "GNU Fortran" implies Fortran 90
in dwarf2out). So, no marking of Fortran 2018 or Fortran 2023.
Jakub
Maybe add a feature request PR for this for Ada and Fortran, to prevent us
from forgetting to evaluate the necessity of, or ability to provide, that
flag/information?
Sorry if I used a wrong term or expression above. I do not yet have any
knowledge of, or experience with, DWARF.
- Andre
On Wed, 8 Jan 2025 09:37:34 +0100
Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Jan 08, 2025 at 09:14:59AM +0100, Eric Botcazou wrote:
> > > So, this patch is an alternative to the
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
> > > patch, which had the major problem that it required changing all the
> > > DWARF consumers to be able to debug C17 or later or C++17 or later
> > > sources.
> >
> > Do you plan to salvage the non-obsoleted parts of the above change?
>
> No. The switches on DW_AT_language value are a GCC internal thing,
> if we never generate say the post-DWARF5 DW_LANG_C_plus_plus_23,
> we don't need to handle it in the switch.
>
> Though, Ada and Fortran could have a similar change to the C/C++
> one, i.e. also add DW_AT_language_{name,version} for newer Ada and Fortran
> versions, say DW_LNAME_Ada and then dunno whether 2005, 2012 and 2022
> or 2007, 2012 and 2023, in https://dwarfstd.org/languages-v6.html
> the versioning scheme for Ada (as well as Fortran) is YYYY.
> Similarly DW_LNAME_Fortran 2018 and 2023.
>
> The reason I haven't done it myself is that the Ada FE doesn't tell
> Ada version at all - there is just "GNU Ada" and dwarf2out.cc right now
> implies it is Ada 95 for DWARF3 and Ada 83 otherwise.
> And in the Fortran case, while the FE provides a version, it only
> does so for Fortran 2003 and 2008 (and just "GNU Fortran" implies Fortran 90
> in dwarf2out). So, no marking of Fortran 2018 or Fortran 2023.
>
> Jakub
>
--
Andre Vehreschild * Email: vehre ad gmx dot de
On Wed, 8 Jan 2025, Jakub Jelinek wrote:
> On Wed, Jan 08, 2025 at 09:14:59AM +0100, Eric Botcazou wrote:
> > > So, this patch is an alternative to the
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
> > > patch, which had the major problem that it required changing all the
> > > DWARF consumers to be able to debug C17 or later or C++17 or later
> > > sources.
> >
> > Do you plan to salvage the non-obsoleted parts of the above change?
>
> No. The switches on DW_AT_language value are a GCC internal thing,
> if we never generate say the post-DWARF5 DW_LANG_C_plus_plus_23,
> we don't need to handle it in the switch.
>
> Though, Ada and Fortran could have a similar change to the C/C++
> one, i.e. also add DW_AT_language_{name,version} for newer Ada and Fortran
> versions, say DW_LNAME_Ada and then dunno whether 2005, 2012 and 2022
> or 2007, 2012 and 2023, in https://dwarfstd.org/languages-v6.html
> the versioning scheme for Ada (as well as Fortran) is YYYY.
> Similarly DW_LNAME_Fortran 2018 and 2023.
>
> The reason I haven't done it myself is that the Ada FE doesn't tell
> Ada version at all - there is just "GNU Ada" and dwarf2out.cc right now
> implies it is Ada 95 for DWARF3 and Ada 83 otherwise.
> And in the Fortran case, while the FE provides a version, it only
> does so for Fortran 2003 and 2008 (and just "GNU Fortran" implies Fortran 90
> in dwarf2out). So, no marking of Fortran 2018 or Fortran 2023.
Btw, with DWARF only we now have the opportunity to add langhooks that
directly create DWARF DIEs or attributes (for early debug).
Richard.
Hi Jakub,
On Tue, 2025-01-07 at 20:22 +0100, Jakub Jelinek wrote:
> DWARF has voted in yesterday https://dwarfstd.org/issues/241209.1.html ,
> which is basically just a guarantee that the DWARF 6 draft
> DW_AT_language_{name,version} attribute codes and content of
> https://dwarfstd.org/languages-v6.html can be used as an extension
> in DWARF 5 and won't be changed.
>
> So, this patch is an alternative to the
> https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
> patch, which had the major problem that it required changing all the
> DWARF consumers to be able to debug C17 or later or C++17 or later
> sources.
Note that most consumers (binutils, gdb, systemtap, valgrind, elfutils)
have already been updated to use the new DW_LANG constants. We can
easily backport that to the last stable releases before gcc 15 is
released.
> This patch uses still DWARF 5 DW_LANG_C11 or DW_LANG_C_plus_plus_14,
> the latest code in DWARF 5 proper, so all DWARF 5 capable consumers
> should be able to deal with that, but additionally emits the
> DWARF 6 attributes so that newer DWARF consumers can see it isn't
> just C++14 but say C++23 or C11 but C23. Consumers which don't know
> those DWARF 6 attributes would just ignore them. This is like any other
> -gno-strict-dwarf extension, except that normally we emit say DWARF 5
> codes where possible only after DWARF 5 is released, while in this case
> there is a guarantee it can be used before DWARF 6 is released.
Code looks ok to me. It is only for C and C++; it would be nice to at
least get it for Fortran and Ada too.
> --- gcc/dwarf2out.cc.jj 2025-01-02 11:23:35.541251268 +0100
> +++ gcc/dwarf2out.cc 2025-01-07 10:09:16.866866563 +0100
> @@ -8755,6 +8755,14 @@ break_out_comdat_types (dw_die_ref die)
> unit = new_die (DW_TAG_type_unit, NULL, NULL);
> add_AT_unsigned (unit, DW_AT_language,
> get_AT_unsigned (comp_unit_die (), DW_AT_language));
> + if (unsigned lname = get_AT_unsigned (comp_unit_die (),
> + DW_AT_language_name))
> + {
> + add_AT_unsigned (unit, DW_AT_language_name, lname);
> + add_AT_unsigned (unit, DW_AT_language_version,
> + get_AT_unsigned (comp_unit_die (),
> + DW_AT_language_version));
> + }
This relies on language_name and language_version always being set
together. Which is the case in the code at this time. Should we assert
that?
> @@ -25376,11 +25388,28 @@ gen_compile_unit_die (const char *filena
> language = DW_LANG_C99;
>
> if (dwarf_version >= 5 /* || !dwarf_strict */)
> - if (strcmp (language_string, "GNU C11") == 0
> - || strcmp (language_string, "GNU C17") == 0
> - || strcmp (language_string, "GNU C23") == 0
> - || strcmp (language_string, "GNU C2Y") == 0)
> - language = DW_LANG_C11;
> + {
> + if (strcmp (language_string, "GNU C11") == 0)
> + language = DW_LANG_C11;
> + else if (strcmp (language_string, "GNU C17") == 0)
> + {
> + language = DW_LANG_C11;
> + lname = DW_LNAME_C;
> + lversion = 201710;
> + }
> + else if (strcmp (language_string, "GNU C23") == 0)
> + {
> + language = DW_LANG_C11;
> + lname = DW_LNAME_C;
> + lversion = 202311;
> + }
> + else if (strcmp (language_string, "GNU C2Y") == 0)
> + {
> + language = DW_LANG_C11;
> + lname = DW_LNAME_C;
> + lversion = 202500;
> + }
> + }
The use of language_string (lang_hooks.name) is a little clumsy, but
already part of the current code. In the future it would be nice to
have separate language and version hooks for this.
Personally I would just go with the new DW_LANG_C17/C23 language code,
they will be supported by consumers when gcc 15 is released.
The 00 for the month in C2Y is clever. Does that match the
__STDC_VERSION__ defined?
> @@ -25392,12 +25421,30 @@ gen_compile_unit_die (const char *filena
> language = DW_LANG_C_plus_plus_11;
> else if (strcmp (language_string, "GNU C++14") == 0)
> language = DW_LANG_C_plus_plus_14;
> - else if (strcmp (language_string, "GNU C++17") == 0
> - || strcmp (language_string, "GNU C++20") == 0
> - || strcmp (language_string, "GNU C++23") == 0
> - || strcmp (language_string, "GNU C++26") == 0)
> - /* For now. */
> - language = DW_LANG_C_plus_plus_14;
> + else if (strcmp (language_string, "GNU C++17") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 201703;
> + }
> + else if (strcmp (language_string, "GNU C++20") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 202002;
> + }
> + else if (strcmp (language_string, "GNU C++23") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 202302;
> + }
> + else if (strcmp (language_string, "GNU C++26") == 0)
> + {
> + language = DW_LANG_C_plus_plus_14;
> + lname = DW_LNAME_C_plus_plus;
> + lversion = 202400;
> + }
Why 202400 and not 202600?
Cheers,
Mark
On Wed, Jan 08, 2025 at 02:35:56PM +0100, Mark Wielaard wrote:
> > --- gcc/dwarf2out.cc.jj 2025-01-02 11:23:35.541251268 +0100
> > +++ gcc/dwarf2out.cc 2025-01-07 10:09:16.866866563 +0100
> > @@ -8755,6 +8755,14 @@ break_out_comdat_types (dw_die_ref die)
> > unit = new_die (DW_TAG_type_unit, NULL, NULL);
> > add_AT_unsigned (unit, DW_AT_language,
> > get_AT_unsigned (comp_unit_die (), DW_AT_language));
> > + if (unsigned lname = get_AT_unsigned (comp_unit_die (),
> > + DW_AT_language_name))
> > + {
> > + add_AT_unsigned (unit, DW_AT_language_name, lname);
> > + add_AT_unsigned (unit, DW_AT_language_version,
> > + get_AT_unsigned (comp_unit_die (),
> > + DW_AT_language_version));
> > + }
>
> This relies on language_name and language_version always being set
> together. Which is the case in the code at this time. Should we assert
> that?
Worst case it will emit the useless 0 in there.
I think we should never emit DW_AT_language_version without
DW_AT_language_name, but the other way around is possible, although
only at DWARF 6 time for languages which don't care about version.
For DWARF 5 we wouldn't bother with DW_AT_language_name in that case
and just emit DW_AT_language.
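The pairing rule above can be modelled outside of GCC roughly as follows. This is only a sketch: `UnitLanguage`, `is_well_formed` and `copy_for_type_unit` are made-up names, and the copy helper merely imitates what the break_out_comdat_types hunk does, on the assumption that a missing attribute reads back as 0.

```cpp
#include <cassert>
#include <optional>

// Hypothetical model of the two DWARF 6 attributes on one unit DIE.
struct UnitLanguage {
  std::optional<unsigned> name;     // DW_AT_language_name
  std::optional<unsigned> version;  // DW_AT_language_version
};

// Rule from the discussion: a version never appears without a name, while a
// name without a version is tolerated.
bool is_well_formed(const UnitLanguage &u) {
  return u.name.has_value() || !u.version.has_value();
}

// Imitates the type-unit copy: only when the compile unit carries a nonzero
// name are both attributes copied, so a missing version shows up as 0.
UnitLanguage copy_for_type_unit(const UnitLanguage &cu) {
  UnitLanguage tu;
  if (cu.name && *cu.name != 0) {
    tu.name = cu.name;
    tu.version = cu.version.value_or(0);  // the "useless 0" worst case
  }
  assert(is_well_formed(tu));
  return tu;
}
```

With this shape the copy can never produce a version without a name, which is the property an assert would have been guarding.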
> The use of language_string (lang_hooks.name) is a little clumsy, but
> already part of the current code. In the future it would be nice to
> have separate language and version hooks for this.
The advantage of that is that it also shows up in human readable form
in DW_AT_producer that way.
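For readers skimming the hunks, the string comparisons the patch adds amount to a small lookup table. The sketch below restates them in that form. It is not how dwarf2out.cc is written: `LangRow`, `find_lang_row` and `emit_language_name` are hypothetical names, and the real function only reaches some of these branches for -gdwarf-5, which this sketch folds into a single check.

```cpp
#include <cassert>   // only needed for the usage checks mentioned below
#include <cstring>

// Codes as used in the patch (DW_LANG_C11 is the DWARF 5 value).
constexpr unsigned DW_LANG_C11 = 0x1d;
constexpr unsigned DW_LANG_C_plus_plus_14 = 0x21;
constexpr unsigned DW_LNAME_C = 0x3;
constexpr unsigned DW_LNAME_C_plus_plus = 0x4;

// One row per language string that receives the DWARF 6 pair in the patch.
struct LangRow {
  const char *language_string;  // value of lang_hooks.name
  unsigned language;            // DWARF 5 DW_AT_language fallback
  unsigned lname;               // DW_AT_language_name
  unsigned lversion;            // DW_AT_language_version, YYYYMM
};

static const LangRow lang_rows[] = {
  {"GNU C17",   DW_LANG_C11,            DW_LNAME_C,           201710},
  {"GNU C23",   DW_LANG_C11,            DW_LNAME_C,           202311},
  {"GNU C2Y",   DW_LANG_C11,            DW_LNAME_C,           202500},
  {"GNU C++17", DW_LANG_C_plus_plus_14, DW_LNAME_C_plus_plus, 201703},
  {"GNU C++20", DW_LANG_C_plus_plus_14, DW_LNAME_C_plus_plus, 202002},
  {"GNU C++23", DW_LANG_C_plus_plus_14, DW_LNAME_C_plus_plus, 202302},
  {"GNU C++26", DW_LANG_C_plus_plus_14, DW_LNAME_C_plus_plus, 202400},
};

// Returns the matching row, or nullptr for strings such as "GNU C11" that
// DWARF 5 already describes exactly and that get no extra attributes.
const LangRow *find_lang_row(const char *language_string) {
  for (const LangRow &row : lang_rows)
    if (std::strcmp(row.language_string, language_string) == 0)
      return &row;
  return nullptr;
}

// Same shape as the patch's `lname && dwarf_version >= 5 && !dwarf_strict`.
bool emit_language_name(const LangRow *row, int dwarf_version,
                        bool dwarf_strict) {
  return row != nullptr && dwarf_version >= 5 && !dwarf_strict;
}
```

The table keeps the human-readable string as the key, so it does not lose the property that the same string shows up in DW_AT_producer.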
> Personally I would just go with the new DW_LANG_C17/C23 language code,
> they will be supported by consumers when gcc 15 is released.
As discussed, I don't think that is a good idea, because some people won't
update all the tools and just use a newer gcc. And the advantages of
DW_LANG_C23/DW_LANG_C_plus_plus_17 definitely don't outweigh the pain of
using older tools (where debugging becomes really clumsy).
> The 00 for the month in C2Y is clever. Does that match the
> __STDC_VERSION__ defined?
Yes.
> Why 202400 and not 202600?
Because that is what __cplusplus is defined to for C++26 right now.
It is newer than 202302, and the 00 month indicates that it is just a temporary thing.
Jakub
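To make the YYYYMM encoding and the hex constants in the new tests easier to check, here is a small sketch. `LanguageVersion`, `decode_yyyymm` and `is_provisional` are hypothetical helpers, not part of the patch; the "month 00 means temporary" reading is taken from the explanation above.

```cpp
#include <cassert>   // only needed for the usage checks mentioned below

// Hypothetical split of a DW_AT_language_version value for C and C++.
struct LanguageVersion {
  unsigned year;
  unsigned month;
};

constexpr LanguageVersion decode_yyyymm(unsigned version) {
  return LanguageVersion{version / 100, version % 100};
}

// A month of 00 marks a placeholder for a standard that is not final yet,
// e.g. 202400 for C++26 or 202500 for C2Y.
constexpr bool is_provisional(unsigned version) {
  return decode_yyyymm(version).month == 0;
}

// The tests scan the assembler output for the values in hexadecimal.
static_assert(0x313e7 == 201703, "lang-cpp17.C expects C++17");
static_assert(0x31512 == 202002, "lang-cpp20.C expects C++20");
static_assert(0x3163e == 202302, "lang-cpp23.C expects C++23");
```

For example, `decode_yyyymm(202311)` yields year 2023 and month 11 for C23, while `is_provisional(202400)` holds for the current C++26 value.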