[v2] gdb: Support embedded source in DWARF

Message ID 20240403165713.452036-1-hawkinsw@obs.cr
State Dropped, archived
Series [v2] gdb: Support embedded source in DWARF

Checks

Context                                          Check    Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64   success  Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm       success  Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm       fail     Testing failed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64   fail     Testing failed

Commit Message

Will Hawkins April 3, 2024, 4:57 p.m. UTC
While DW_LNCT_source is not yet finalized in the DWARF standard
(https://dwarfstd.org/issues/180201.1.html), LLVM does emit it.

This patch adds support for it in gdb.

Tested on x86_64-redhat-linux.

Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
---

Notes:
    v1 -> v2
      - Address feedback from original PR
      - Add support for maintenance commands to see embedded source status
      - Prevent access to the filesystem for symtabs with embedded source
      - Add additional unit tests

 gdb/dwarf2/file-and-dir.h                    | 26 +++++-
 gdb/dwarf2/line-header.c                     | 22 +++--
 gdb/dwarf2/line-header.h                     |  9 +-
 gdb/dwarf2/read.c                            | 72 ++++++++++++++--
 gdb/source-cache.c                           | 48 ++++++-----
 gdb/source.c                                 | 70 +++++++++++-----
 gdb/source.h                                 |  4 +
 gdb/symmisc.c                                | 13 ++-
 gdb/symtab.h                                 |  2 +
 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c   | 24 ++++++
 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp | 88 ++++++++++++++++++++
 gdb/testsuite/lib/dwarf.exp                  | 19 +++--
 include/dwarf2.h                             |  1 +
 13 files changed, 331 insertions(+), 67 deletions(-)
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp
  

Comments

Tom Tromey April 3, 2024, 8:34 p.m. UTC | #1
>>>>> "Will" == Will Hawkins <hawkinsw@obs.cr> writes:

Will> While DW_LNCT_source is not yet finalized in the DWARF standard
Will> (https://dwarfstd.org/issues/180201.1.html), LLVM does emit it.

Will> This patch adds support for it in gdb.

Will> Tested on x86_64-redhat-linux.

Will> Signed-off-by: Will Hawkins <hawkinsw@obs.cr>

Thanks for the patch.
It looks basically reasonable to me but I found a few oddities and then
also some of the normal stylistic nits.

Will> +      m_is_embedded (false)

It's better to use inline initialization, like:

Will> +  /* Whether the file's source is embedded in the dwarf.  */
Will> +  bool m_is_embedded;

   bool m_is_embedded = false;

Will> +	    case DW_LNCT_LLVM_SOURCE:
Will> +	      if (string.has_value () && strlen (*string))
Will> +		fe.source = *string;
Will> +	      break;

If the source can be large then it seems bad to spend time computing the
strlen, when just checking if the first byte is '\0' gives the same
answer in constant time.

Will> +  /* Whether or not the file names refer to sources embedded
Will> +     in the dwarf.  */
Will> +  bool *embeddeds;

Probably should be 'const bool *' since IIUC this is set up once and
then never written to again.

Will> +	      if (entry.source != NULL)
Will> +		embeddeds.push_back (false);
Will> +	      else
Will> +		embeddeds.push_back(true);

embeddeds.push_back (entry.source == nullptr);

Will> +  qfn->embeddeds =
Will> +    XOBNEWVEC (&per_objfile->per_bfd->obstack, bool,

gdb puts '=' on the continuation line.

Will> +	fullname.reset (embedded_fullname (dirname, qfn->file_names[index]));

embedded_fullname should return a unique_xmalloc_ptr so that this spot
can be an assignment.  Maybe one spot needs a '.release ()' then.

It's better to return these self-managing objects, since it makes leaks
less likely.
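
A rough sketch of that suggestion outside of gdb.  Here 'unique_cstr' is
a stand-in for gdb::unique_xmalloc_ptr<char>, the function name is
hypothetical, and "/" is assumed in place of SLASH_STRING.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Stand-in for gdb::unique_xmalloc_ptr<char>: a unique pointer that
// releases its memory with free ().
struct free_deleter
{
  void operator() (char *p) const { std::free (p); }
};
using unique_cstr = std::unique_ptr<char, free_deleter>;

// Hypothetical variant of embedded_fullname that returns an owning
// pointer instead of a raw 'char *'.
unique_cstr
embedded_fullname_sketch (const char *dirname, const char *filename)
{
  assert (filename != nullptr);

  std::string full;
  if (dirname != nullptr)
    {
      full = dirname;
      full += '/';
    }
  full += filename;

  // Copy into malloc'd storage so that free () is the matching release.
  char *copy = static_cast<char *> (std::malloc (full.size () + 1));
  if (copy == nullptr)
    std::abort ();
  std::memcpy (copy, full.c_str (), full.size () + 1);
  return unique_cstr (copy);
}
```

With a return type like this, the call site becomes a plain assignment,
and a spot that stores into a raw 'char *' field would call
'.release ()' on the result.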

Will> +  /* Because the line header may tell us information about the CU
Will> +     filename (e.g., whether it is embedded) which will affect other
Will> +     calculations, we have to read that information here.  */
Will> +  line_header_up lh;
Will> +  lh.reset (cu->line_header);

This will wind up deleting cu->line_header, which I assume is not
intended.  For one thing, dwarf2_cu::line_header seems to have a kind of
complicated ownership situation.

In gdb the "_up" suffix on a type means "unique pointer"; i.e., 'lh'
will delete the memory when its destructor is run.
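
One possible shape for avoiding this, sketched with hypothetical
stand-in types rather than gdb's real line_header and dwarf2_cu: keep an
observing pointer for the header that already exists, and let a unique
pointer own only a header that was freshly decoded.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for gdb's line_header and dwarf2_cu.
struct line_header_sketch
{
  int version = 5;
};
using line_header_sketch_up = std::unique_ptr<line_header_sketch>;

struct cu_sketch
{
  // May be null; when set, the object is owned elsewhere.
  line_header_sketch *line_header = nullptr;
};

// Hypothetical decoder standing in for dwarf_decode_line_header.
line_header_sketch_up
decode_line_header_sketch ()
{
  return std::make_unique<line_header_sketch> ();
}

// Return an observing pointer to a usable header.  HOLDER takes
// ownership only when a fresh header had to be decoded; a header
// already attached to CU is never handed to a unique pointer.
const line_header_sketch *
borrow_or_decode (const cu_sketch &cu, line_header_sketch_up &holder)
{
  if (cu.line_header != nullptr)
    return cu.line_header;
  holder = decode_line_header_sketch ();
  return holder.get ();
}
```

When HOLDER goes out of scope it frees only what it decoded itself, so
the header attached to the CU is left alone.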

Will> +	  const char *include_name =
Will> +	    compute_include_file_name (lh.get (), entry, res, name_holder);

'=' on 2nd line.

Will> +	  if (!include_name )

gdb tends to spell this out, like: include_name == nullptr

Will> +	{
Will> +	  sf->symtab = allocate_symtab (cust, sf->name.c_str (),
Will> +					sf->name_for_id.c_str ());
Will> +	  if (fe.source)
Will> +	    sf->symtab->source = cu->per_objfile->objfile->intern (fe.source);

You don't want to intern the source here.  That makes a copy.  If the
source is in one of the debug sections, those are mapped during DWARF
reading and not unmapped until the per-BFD object is destroyed.  Objects
like this can safely be referred to by other symbol table data
structures; there are a lot of cases of this in gdb.
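
A small sketch of the observing-pointer approach, using hypothetical
stand-in types.  It assumes, as described above, that the section bytes
outlive every symtab that points into them.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a mapped debug section that stays alive
// for as long as the per-BFD object does.
struct mapped_section_sketch
{
  std::vector<char> bytes;
};

// Hypothetical stand-in for the new symtab field.
struct embedded_symtab_sketch
{
  // Observing pointer into the mapped section; never freed here.
  const char *source = nullptr;
};

// Point S at the NUL-terminated text starting at OFFSET.  No bytes
// are copied, so SECTION must outlive S.
void
attach_embedded_source (embedded_symtab_sketch &s,
                        const mapped_section_sketch &section,
                        std::size_t offset)
{
  assert (offset < section.bytes.size ());
  s.source = section.bytes.data () + offset;
}

// View the attached text without copying it.
std::string_view
embedded_source_view (const embedded_symtab_sketch &s)
{
  return s.source != nullptr ? std::string_view (s.source)
                             : std::string_view ();
}
```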

Will>  source_cache::get_plain_source_lines (struct symtab *s,
Will>  				      const std::string &fullname)
Will>  {
Will> -  scoped_fd desc (open_source_file (s));
Will> -  if (desc.get () < 0)
Will> -    perror_with_name (symtab_to_filename_for_display (s), -desc.get ());
Will> +  std::string lines;

[...]

Will> +  else
Will> +    {
Will> +      lines = s->source;
Will> +    }

gdb doesn't use braces when there's just a single statement.

However, I suspect you will want a different factoring here.  This line
makes a copy of the entire source text -- but a copy isn't needed here.

Currently, get_plain_source_lines is written this way because it is
reading from the filesystem, so the data has to go somewhere.

Instead, what if the caller (source_cache::ensure) did the checking
here?  Then it could make a string_view for the text -- either the
saved source or the string returned by get_plain_source_lines.

Then try_source_highlight and ext_lang_colorize could be changed to
accept a string view instead.  (I'm not 100% sure this works, maybe just
passing a 'const char *' is better, it depends on if this code assumes
\0-termination.)

source_cache::source_text will also need some kind of update, but that
doesn't seem so bad either.

This approach should avoid copying as much as possible.  It's maybe not
possible to completely avoid it (I didn't look to see what the Pygments
call does); but at least we can avoid it where it's not needed.
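
A hedged sketch of the caller-side factoring, with hypothetical names
throughout; none of this is gdb's actual source_cache code.  The caller
chooses between the embedded text and a freshly read buffer, and
everything downstream sees only a string view.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for struct symtab with the new field.
struct symtab_sketch
{
  const char *source = nullptr;
};

// Hypothetical stand-in for get_plain_source_lines; in gdb this would
// read the file from the filesystem.
std::string
read_from_disk_sketch (const symtab_sketch &)
{
  return "from disk\n";
}

// Select the text to highlight.  STORAGE is filled only on the
// filesystem path; on the embedded path the view refers directly to
// S.source and no copy of the text is made.
std::string_view
select_source_text (const symtab_sketch &s, std::string &storage)
{
  if (s.source != nullptr)
    return std::string_view (s.source);
  storage = read_from_disk_sketch (s);
  return storage;
}
```

Two caveats on this sketch: building the view from a 'const char *'
still scans for the terminating NUL once, and a string view does not
guarantee \0-termination, so any consumer that assumes a C string would
need a 'const char *' instead.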

Will> +  if (s->source != NULL)

We're slowly switching to nullptr.  There are a few cases of this.

Will> diff --git a/gdb/symtab.h b/gdb/symtab.h
Will> index bf9a3cfb79f..1f871ee0a74 100644
Will> --- a/gdb/symtab.h
Will> +++ b/gdb/symtab.h
Will> @@ -1755,6 +1755,8 @@ struct symtab
 
Will>    const char *filename;
 
Will> +  const char *source;

This should have a comment.  Older code may sometimes be missing one,
but we try to add one for anything new.

Will> +# Copyright 2022-2024 Free Software Foundation, Inc.

Probably should just say 2024.

Will> +set assign_m_line [gdb_get_line_number "main assign m"]
Will> +gdb_test "frame" ".*main \\\(\\\) at \[^\r\n\]*:$assign_m_line\r\n.*"
Will> +gdb_test "maintenance info symtabs missing-file.c" ".*source embedded in DWARF.*"
Will> +gdb_test "maintenance info line-table missing-file.c" ".*symtab: Source embedded in DWARF.*"
Will> +gdb_test "info source" ".*With embedded source.*"

Probably some of these lines need to be broken up.

Also I wonder if the output here is voluminous.  That can cause problems
unless you use gdb_test_multiple.

Will> diff --git a/include/dwarf2.h b/include/dwarf2.h
Will> index b3d3731ee83..bf8bdcd608f 100644
Will> --- a/include/dwarf2.h
Will> +++ b/include/dwarf2.h
Will> @@ -289,6 +289,7 @@ enum dwarf_line_number_content_type
Will>      DW_LNCT_size = 0x4,
Will>      DW_LNCT_MD5 = 0x5,
Will>      DW_LNCT_lo_user = 0x2000,
Will> +    DW_LNCT_LLVM_SOURCE = 0x2001,

I think this needs some kind of comment as well.  Maybe a link to
whatever spec or docs there are.  Look in dwarf2.{h,def} for "http" to
see some examples.

Tom
  
Will Hawkins April 3, 2024, 9:20 p.m. UTC | #2
On Wed, Apr 3, 2024 at 4:34 PM Tom Tromey <tom@tromey.com> wrote:
>
> >>>>> "Will" == Will Hawkins <hawkinsw@obs.cr> writes:
>
> Will> While DW_LNCT_source is not yet finalized in the DWARF standard
> Will> (https://dwarfstd.org/issues/180201.1.html), LLVM does emit it.
>
> Will> This patch adds support for it in gdb.
>
> Will> Tested on x86_64-redhat-linux.
>
> Will> Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
>
> Thanks for the patch.
> It looks basically reasonable to me but I found a few oddities and then
> also some of the normal stylistic nits.
>

Thank you for the feedback! As I said for the other patch that I
submitted, I was nervous and hoping to hear some positive feedback.
And, I got it ... so mission accomplished for the day.

> Will> +      m_is_embedded (false)
>
> It's better to use inline initialization, like:
>
> Will> +  /* Whether the file's source is embedded in the dwarf.  */
> Will> +  bool m_is_embedded;
>
>    bool m_is_embedded = false;

Makes perfect sense. I obviously had a hard time deciphering the
prevailing style!

>
> Will> +     case DW_LNCT_LLVM_SOURCE:
> Will> +       if (string.has_value () && strlen (*string))
> Will> +         fe.source = *string;
> Will> +       break;
>
> If the source can be large then it seems bad to spend time computing the
> strlen, when just checking if the first byte is '\0' gives the same
> answer in constant time.

100% agree. Thank you!


>
> Will> +  /* Whether or not the file names refer to sources embedded
> Will> +     in the dwarf.  */
> Will> +  bool *embeddeds;
>
> Probably should be 'const bool *' since IIUC this is set up once and
> then never written to again.

Of course!

>
> Will> +       if (entry.source != NULL)
> Will> +         embeddeds.push_back (false);
> Will> +       else
> Will> +         embeddeds.push_back(true);
>
> embeddeds.push_back (entry.source == nullptr);

I have *no* idea why I didn't write that the first time.

>
> Will> +  qfn->embeddeds =
> Will> +    XOBNEWVEC (&per_objfile->per_bfd->obstack, bool,
>
> gdb puts '=' on the continuation line.

Sorry for not respecting the style here. I tried really hard to make
sure that I was doing the right thing!

>
> Will> + fullname.reset (embedded_fullname (dirname, qfn->file_names[index]));
>
> embedded_fullname should return a unique_xmalloc_ptr so that this spot
> can be an assignment.  Maybe one spot needs a '.release ()' then.
>
> It's better to return these self-managing objects, since it makes leaks
> less likely.

I agree. I went back and forth on what type embedded_fullname should
return. Thank you for the feedback.

>
> Will> +  /* Because the line header may tell us information about the CU
> Will> +     filename (e.g., whether it is embedded) which will affect other
> Will> +     calculations, we have to read that information here.  */
> Will> +  line_header_up lh;
> Will> +  lh.reset (cu->line_header);
>
> This will wind up deleting cu->line_header, which I assume is not
> intended.  For one thing, dwarf2_cu::line_header seems to have a kind of
> complicated ownership situation.
>
> In gdb the "_up" suffix on a type means "unique pointer"; i.e., 'lh'
> will delete the memory when its destructor is run.

Agree. I have no idea how I didn't realize that. I do really like the
`_up` convention. Makes it very easy to read!

>
> Will> +   const char *include_name =
> Will> +     compute_include_file_name (lh.get (), entry, res, name_holder);
>
> '=' on 2nd line.
>
> Will> +   if (!include_name )
>
> gdb tends to spell this out, like: include_name == nullptr

See above for the apology and I will correct!

>
> Will> + {
> Will> +   sf->symtab = allocate_symtab (cust, sf->name.c_str (),
> Will> +                                 sf->name_for_id.c_str ());
> Will> +   if (fe.source)
> Will> +     sf->symtab->source = cu->per_objfile->objfile->intern (fe.source);
>
> You don't want to intern the source here.  That makes a copy.  If the
> source is in one of the debug sections, those are mapped during DWARF
> reading and not unmapped until the per-BFD object is destroyed.  Objects
> like this can safely be referred to by other symbol table data
> structures; there are a lot of cases of this in gdb.

I am terribly sorry -- you must think that I completely ignored your
previous feedback where you said the same thing. I didn't mean to
imply that. I simply screwed up my git history. I promise that I did
read and process your previous response!

>
> Will>  source_cache::get_plain_source_lines (struct symtab *s,
> Will>                                 const std::string &fullname)
> Will>  {
> Will> -  scoped_fd desc (open_source_file (s));
> Will> -  if (desc.get () < 0)
> Will> -    perror_with_name (symtab_to_filename_for_display (s), -desc.get ());
> Will> +  std::string lines;
>
> [...]
>
> Will> +  else
> Will> +    {
> Will> +      lines = s->source;
> Will> +    }
>
> gdb doesn't use braces when there's just a single statement.
>
> However, I suspect you will want a different factoring here.  This line
> makes a copy of the entire source text -- but a copy isn't needed here.
>
> Currently, get_plain_source_lines is written this way because it is
> reading from the filesystem, so the data has to go somewhere.
>
> Instead, what if the caller (source_cache::ensure) did the checking
> here?  Then it could make a string_view for the text -- either the
> saved source or the string returned by get_plain_source_lines.
>
> Then try_source_highlight and ext_lang_colorize could be changed to
> accept a string view instead.  (I'm not 100% sure this works, maybe just
> passing a 'const char *' is better, it depends on if this code assumes
> \0-termination.)
>
> source_cache::source_text will also need some kind of update, but that
> doesn't seem so bad either.
>
> This approach should avoid copying as much as possible.  It's maybe not
> possible to completely avoid it (I didn't look to see what the Pygments
> call does); but at least we can avoid it where it's not needed.

Thank you! I will process this and attempt to make a good v3.

>
> Will> +  if (s->source != NULL)
>
> We're slowly switching to nullptr.  There are a few cases of this.

Ack.

>
> Will> diff --git a/gdb/symtab.h b/gdb/symtab.h
> Will> index bf9a3cfb79f..1f871ee0a74 100644
> Will> --- a/gdb/symtab.h
> Will> +++ b/gdb/symtab.h
> Will> @@ -1755,6 +1755,8 @@ struct symtab
>
> Will>    const char *filename;
>
> Will> +  const char *source;
>
> This should have a comment.  Older code may sometimes be missing one,
> but we try to add one for anything new.

I thought I got all the places that needed a comment. Sorry!


>
> Will> +# Copyright 2022-2024 Free Software Foundation, Inc.
>
> Probably should just say 2024.
>
> Will> +set assign_m_line [gdb_get_line_number "main assign m"]
> Will> +gdb_test "frame" ".*main \\\(\\\) at \[^\r\n\]*:$assign_m_line\r\n.*"
> Will> +gdb_test "maintenance info symtabs missing-file.c" ".*source embedded in DWARF.*"
> Will> +gdb_test "maintenance info line-table missing-file.c" ".*symtab: Source embedded in DWARF.*"
> Will> +gdb_test "info source" ".*With embedded source.*"
>
> Probably some of these lines need to be broken up.
>
> Also I wonder if the output here is voluminous.  That can cause problems
> unless you use gdb_test_multiple.

Interesting! I will look into that!

>
> Will> diff --git a/include/dwarf2.h b/include/dwarf2.h
> Will> index b3d3731ee83..bf8bdcd608f 100644
> Will> --- a/include/dwarf2.h
> Will> +++ b/include/dwarf2.h
> Will> @@ -289,6 +289,7 @@ enum dwarf_line_number_content_type
> Will>      DW_LNCT_size = 0x4,
> Will>      DW_LNCT_MD5 = 0x5,
> Will>      DW_LNCT_lo_user = 0x2000,
> Will> +    DW_LNCT_LLVM_SOURCE = 0x2001,
>
> I think this needs some kind of comment as well.  Maybe a link to
> whatever spec or docs there are.  Look in dwarf2.{h,def} for "http" to
> see some examples.

Absolutely will add a link to the same doc linked in the commit message.

Thank you, again, for the comments. I really appreciate you being so welcoming!

Sincerely,
Will

>
> Tom
  

Patch

diff --git a/gdb/dwarf2/file-and-dir.h b/gdb/dwarf2/file-and-dir.h
index a5b1d8a3a21..072e5256aa1 100644
--- a/gdb/dwarf2/file-and-dir.h
+++ b/gdb/dwarf2/file-and-dir.h
@@ -38,7 +38,8 @@  struct file_and_directory
 {
   file_and_directory (const char *name, const char *dir)
     : m_name (name),
-      m_comp_dir (dir)
+      m_comp_dir (dir),
+      m_is_embedded (false)
   {
   }
 
@@ -95,7 +96,14 @@  struct file_and_directory
   const char *get_fullname ()
   {
     if (m_fullname == nullptr)
-      m_fullname = find_source_or_rewrite (get_name (), get_comp_dir ());
+      {
+	if (m_is_embedded)
+	  m_fullname = make_unique_xstrdup (embedded_fullname (
+							       get_name (),
+							       get_comp_dir ()));
+	else
+	  m_fullname = find_source_or_rewrite (get_name (), get_comp_dir ());
+      }
     return m_fullname.get ();
   }
 
@@ -105,6 +113,17 @@  struct file_and_directory
     m_fullname.reset ();
   }
 
+  /* Set whether the file's source is embedded in the dwarf.  */
+  void set_embedded (bool is_embedded)
+  {
+    m_is_embedded = is_embedded;
+  }
+
+  /* Return true if the file's source is embedded in the dwarf.  */
+  bool is_embedded () const
+  {
+    return m_is_embedded;
+  }
 private:
 
   /* The filename.  */
@@ -124,6 +143,9 @@  struct file_and_directory
 
   /* The full name.  */
   gdb::unique_xmalloc_ptr<char> m_fullname;
+
+  /* Whether the file's source is embedded in the dwarf.  */
+  bool m_is_embedded;
 };
 
 #endif /* GDB_DWARF2_FILE_AND_DIR_H */
diff --git a/gdb/dwarf2/line-header.c b/gdb/dwarf2/line-header.c
index a3ca49b64f5..de6c31d877d 100644
--- a/gdb/dwarf2/line-header.c
+++ b/gdb/dwarf2/line-header.c
@@ -45,6 +45,7 @@  line_header::add_include_dir (const char *include_dir)
 void
 line_header::add_file_name (const char *name,
 			    dir_index d_index,
+			    const char *source,
 			    unsigned int mod_time,
 			    unsigned int length)
 {
@@ -54,7 +55,7 @@  line_header::add_file_name (const char *name,
   if (dwarf_line_debug >= 2)
     gdb_printf (gdb_stdlog, "Adding file %d: %s\n", index, name);
 
-  m_file_names.emplace_back (name, index, d_index, mod_time, length);
+  m_file_names.emplace_back (name, index, d_index, source, mod_time, length);
 }
 
 std::string
@@ -125,6 +126,7 @@  read_formatted_entries (dwarf2_per_objfile *per_objfile, bfd *abfd,
 			void (*callback) (struct line_header *lh,
 					  const char *name,
 					  dir_index d_index,
+					  const char *source,
 					  unsigned int mod_time,
 					  unsigned int length))
 {
@@ -239,13 +241,17 @@  read_formatted_entries (dwarf2_per_objfile *per_objfile, bfd *abfd,
 	      break;
 	    case DW_LNCT_MD5:
 	      break;
+	    case DW_LNCT_LLVM_SOURCE:
+	      if (string.has_value () && strlen (*string))
+		fe.source = *string;
+	      break;
 	    default:
 	      complaint (_("Unknown format content type %s"),
 			 pulongest (content_type));
 	    }
 	}
 
-      callback (lh, fe.name, fe.d_index, fe.mod_time, fe.length);
+      callback (lh, fe.name, fe.d_index, fe.source, fe.mod_time, fe.length);
     }
 
   *bufp = buf;
@@ -368,8 +374,8 @@  dwarf_decode_line_header  (sect_offset sect_off, bool is_dwz,
       read_formatted_entries (per_objfile, abfd, &line_ptr, lh.get (),
 			      offset_size,
 			      [] (struct line_header *header, const char *name,
-				  dir_index d_index, unsigned int mod_time,
-				  unsigned int length)
+				  dir_index d_index, const char *source,
+				  unsigned int mod_time, unsigned int length)
 	{
 	  header->add_include_dir (name);
 	});
@@ -378,10 +384,10 @@  dwarf_decode_line_header  (sect_offset sect_off, bool is_dwz,
       read_formatted_entries (per_objfile, abfd, &line_ptr, lh.get (),
 			      offset_size,
 			      [] (struct line_header *header, const char *name,
-				  dir_index d_index, unsigned int mod_time,
-				  unsigned int length)
+				  dir_index d_index, const char *source,
+				  unsigned int mod_time, unsigned int length)
 	{
-	  header->add_file_name (name, d_index, mod_time, length);
+	  header->add_file_name (name, d_index, source, mod_time, length);
 	});
     }
   else
@@ -408,7 +414,7 @@  dwarf_decode_line_header  (sect_offset sect_off, bool is_dwz,
 	  length = read_unsigned_leb128 (abfd, line_ptr, &bytes_read);
 	  line_ptr += bytes_read;
 
-	  lh->add_file_name (cur_file, d_index, mod_time, length);
+	  lh->add_file_name (cur_file, d_index, nullptr, mod_time, length);
 	}
       line_ptr += bytes_read;
     }
diff --git a/gdb/dwarf2/line-header.h b/gdb/dwarf2/line-header.h
index c068dff70a3..abc95f3ee87 100644
--- a/gdb/dwarf2/line-header.h
+++ b/gdb/dwarf2/line-header.h
@@ -35,9 +35,10 @@  struct file_entry
   file_entry () = default;
 
   file_entry (const char *name_, file_name_index index_, dir_index d_index_,
-	      unsigned int mod_time_, unsigned int length_)
+	      const char *source_, unsigned int mod_time_, unsigned int length_)
     : name (name_),
       index (index_),
+      source (source_),
       d_index (d_index_),
       mod_time (mod_time_),
       length (length_)
@@ -54,6 +55,10 @@  struct file_entry
   /* The index of this file in the file table.  */
   file_name_index index {};
 
+  /* The file's contents (if not null).  Note this is an observing pointer.
+     The memory is owned by debug_line_buffer.  */
+  const char *source {};
+
   /* The directory index (1-based).  */
   dir_index d_index {};
 
@@ -88,7 +93,7 @@  struct line_header
   void add_include_dir (const char *include_dir);
 
   /* Add an entry to the file name table.  */
-  void add_file_name (const char *name, dir_index d_index,
+  void add_file_name (const char *name, dir_index d_index, const char *source,
 		      unsigned int mod_time, unsigned int length);
 
   /* Return the include dir at INDEX (0-based in DWARF 5 and 1-based before).
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 7442094874c..7687a1f9a01 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -1635,6 +1635,10 @@  struct quick_file_names
   /* The file names from the line table after being run through
      gdb_realpath.  These are computed lazily.  */
   const char **real_names;
+
+  /* Whether or not the file names refer to sources embedded
+     in the dwarf.  */
+  bool *embeddeds;
 };
 
 /* With OBJF_READNOW, the DWARF reader expands all CUs immediately.
@@ -1908,6 +1912,10 @@  dw2_get_file_names_reader (const struct die_reader_specs *reader,
   if (slot != nullptr)
     *slot = qfn;
 
+
+  bool cu_file_embedded = false;
+  std::vector<bool> embeddeds;
+
   std::vector<const char *> include_names;
   if (lh != nullptr)
     {
@@ -1920,7 +1928,18 @@  dw2_get_file_names_reader (const struct die_reader_specs *reader,
 	    {
 	      include_name = per_objfile->objfile->intern (include_name);
 	      include_names.push_back (include_name);
+	      if (entry.source != NULL)
+		embeddeds.push_back (false);
+	      else
+		embeddeds.push_back(true);
 	    }
+	  else if (entry.source != NULL)
+	    {
+	      /* We have an embedded source for the CU.  */
+	      gdb_assert (offset == 1);
+	      cu_file_embedded = true;
+	    }
+
 	}
     }
 
@@ -1936,6 +1955,14 @@  dw2_get_file_names_reader (const struct die_reader_specs *reader,
     memcpy (&qfn->file_names[offset], include_names.data (),
 	    include_names.size () * sizeof (const char *));
 
+  qfn->embeddeds =
+    XOBNEWVEC (&per_objfile->per_bfd->obstack, bool,
+	       qfn->num_file_names);
+  if (cu_file_embedded)
+    qfn->embeddeds[0] = true;
+  for (size_t i = 0; i < embeddeds.size (); i++)
+    qfn->embeddeds[offset + i] = embeddeds[i];
+
   qfn->real_names = NULL;
 
   lh_cu->file_names = qfn;
@@ -1980,7 +2007,11 @@  dw2_get_real_path (dwarf2_per_objfile *per_objfile,
 	dirname = qfn->comp_dir;
 
       gdb::unique_xmalloc_ptr<char> fullname;
-      fullname = find_source_or_rewrite (qfn->file_names[index], dirname);
+
+      if (qfn->embeddeds[index])
+	fullname.reset (embedded_fullname (dirname, qfn->file_names[index]));
+      else
+	fullname = find_source_or_rewrite (qfn->file_names[index], dirname);
 
       qfn->real_names[index] = fullname.release ();
     }
@@ -7311,6 +7342,34 @@  find_file_and_directory (struct die_info *die, struct dwarf2_cu *cu)
   file_and_directory res (dwarf2_string_attr (die, DW_AT_name, cu),
 			  dwarf2_string_attr (die, DW_AT_comp_dir, cu));
 
+  /* Because the line header may tell us information about the CU
+     filename (e.g., whether it is embedded) which will affect other
+     calculations, we have to read that information here.  */
+  line_header_up lh;
+  lh.reset (cu->line_header);
+  struct attribute *attr = dwarf2_attr (die, DW_AT_stmt_list, cu);
+  if (lh == nullptr && attr != nullptr && attr->form_is_unsigned ())
+    {
+      sect_offset line_offset = (sect_offset) attr->as_unsigned ();
+      lh = dwarf_decode_line_header (line_offset, cu,
+				     res.get_comp_dir ());
+    }
+
+  if (lh != nullptr)
+    {
+      for (const auto &entry : lh->file_names ())
+	{
+	  if (entry.source == NULL)
+	    continue;
+
+	  std::string name_holder;
+	  const char *include_name =
+	    compute_include_file_name (lh.get (), entry, res, name_holder);
+	  if (!include_name )
+	    res.set_embedded (true);
+	}
+    }
+
   if (res.get_comp_dir () == nullptr
       && producer_is_gcc_lt_4_3 (cu)
       && res.get_name () != nullptr
@@ -18448,7 +18507,7 @@  dwarf_decode_lines_1 (struct line_header *lh, struct dwarf2_cu *cu,
 		    length =
 		      read_unsigned_leb128 (abfd, line_ptr, &bytes_read);
 		    line_ptr += bytes_read;
-		    lh->add_file_name (cur_file, dindex, mod_time, length);
+		    lh->add_file_name (cur_file, dindex, nullptr, mod_time, length);
 		  }
 		  break;
 		case DW_LNE_set_discriminator:
@@ -18603,9 +18662,12 @@  dwarf_decode_lines (struct line_header *lh, struct dwarf2_cu *cu,
       subfile *sf = builder->get_current_subfile ();
 
       if (sf->symtab == nullptr)
-	sf->symtab = allocate_symtab (cust, sf->name.c_str (),
-				      sf->name_for_id.c_str ());
-
+	{
+	  sf->symtab = allocate_symtab (cust, sf->name.c_str (),
+					sf->name_for_id.c_str ());
+	  if (fe.source)
+	    sf->symtab->source = cu->per_objfile->objfile->intern (fe.source);
+	}
       fe.symtab = sf->symtab;
     }
 }
diff --git a/gdb/source-cache.c b/gdb/source-cache.c
index 8b5bd84d19a..4075772a40b 100644
--- a/gdb/source-cache.c
+++ b/gdb/source-cache.c
@@ -97,28 +97,36 @@  std::string
 source_cache::get_plain_source_lines (struct symtab *s,
 				      const std::string &fullname)
 {
-  scoped_fd desc (open_source_file (s));
-  if (desc.get () < 0)
-    perror_with_name (symtab_to_filename_for_display (s), -desc.get ());
+  std::string lines;
+  if (!s->source)
+    {
 
-  struct stat st;
-  if (fstat (desc.get (), &st) < 0)
-    perror_with_name (symtab_to_filename_for_display (s));
+      scoped_fd desc (open_source_file (s));
+      if (desc.get () < 0)
+	perror_with_name (symtab_to_filename_for_display (s), -desc.get ());
 
-  std::string lines;
-  lines.resize (st.st_size);
-  if (myread (desc.get (), &lines[0], lines.size ()) < 0)
-    perror_with_name (symtab_to_filename_for_display (s));
-
-  time_t mtime = 0;
-  if (s->compunit ()->objfile () != NULL
-      && s->compunit ()->objfile ()->obfd != NULL)
-    mtime = s->compunit ()->objfile ()->mtime;
-  else if (current_program_space->exec_bfd ())
-    mtime = current_program_space->ebfd_mtime;
-
-  if (mtime && mtime < st.st_mtime)
-    warning (_("Source file is more recent than executable."));
+      struct stat st;
+      if (fstat (desc.get (), &st) < 0)
+	perror_with_name (symtab_to_filename_for_display (s));
+
+      lines.resize (st.st_size);
+      if (myread (desc.get (), &lines[0], lines.size ()) < 0)
+	perror_with_name (symtab_to_filename_for_display (s));
+
+      time_t mtime = 0;
+      if (s->compunit ()->objfile () != NULL
+	  && s->compunit ()->objfile ()->obfd != NULL)
+	mtime = s->compunit ()->objfile ()->mtime;
+      else if (current_program_space->exec_bfd ())
+	mtime = current_program_space->ebfd_mtime;
+
+      if (mtime && mtime < st.st_mtime)
+	warning (_("Source file is more recent than executable."));
+    }
+  else
+    {
+      lines = s->source;
+    }
 
   std::vector<off_t> offsets;
   offsets.push_back (0);
diff --git a/gdb/source.c b/gdb/source.c
index bbeb4154258..30ea8433e24 100644
--- a/gdb/source.c
+++ b/gdb/source.c
@@ -686,7 +686,9 @@  info_source_command (const char *ignore, int from_tty)
   gdb_printf (_("Current source file is %s\n"), s->filename);
   if (s->compunit ()->dirname () != NULL)
     gdb_printf (_("Compilation directory is %s\n"), s->compunit ()->dirname ());
-  if (s->fullname)
+  if (s->source != NULL)
+    gdb_printf (_("With embedded source.\n"));
+  else if (s->fullname)
     gdb_printf (_("Located in %s\n"), s->fullname);
   const std::vector<off_t> *offsets;
   if (g_source_cache.get_line_charpos (s, &offsets))
@@ -960,6 +962,19 @@  source_full_path_of (const char *filename,
   return 1;
 }
 
+/* See source.h.  */
+
+char *
+embedded_fullname (const char *dirname, const char *filename)
+{
+  if (dirname != NULL)
+    {
+      return concat (dirname, SLASH_STRING, filename, (char *) nullptr);
+    }
+
+  return xstrdup (filename);
+}
+
 /* Return non-zero if RULE matches PATH, that is if the rule can be
    applied to PATH.  */
 
@@ -1239,25 +1254,33 @@  symtab_to_fullname (struct symtab *s)
      to handle cases like the file being moved.  */
   if (s->fullname == NULL)
     {
-      scoped_fd fd = open_source_file (s);
-
-      if (fd.get () < 0)
+      if (s->source)
+	  s->fullname = embedded_fullname (s->compunit ()->dirname (),
+					   s->filename);
+      else
 	{
-	  gdb::unique_xmalloc_ptr<char> fullname;
+	  scoped_fd fd = open_source_file (s);
 
-	  /* rewrite_source_path would be applied by find_and_open_source, we
-	     should report the pathname where GDB tried to find the file.  */
+	  if (fd.get () < 0)
+	    {
+	      gdb::unique_xmalloc_ptr<char> fullname;
 
-	  if (s->compunit ()->dirname () == nullptr
-	      || IS_ABSOLUTE_PATH (s->filename))
-	    fullname.reset (xstrdup (s->filename));
-	  else
-	    fullname.reset (concat (s->compunit ()->dirname (), SLASH_STRING,
-				    s->filename, (char *) NULL));
+	      /* rewrite_source_path would be applied by find_and_open_source,
+		 we should report the pathname where GDB tried to find the
+		 file.  */
+
+	      if (s->compunit ()->dirname () == nullptr
+		  || IS_ABSOLUTE_PATH (s->filename))
+		fullname.reset (xstrdup (s->filename));
+	      else
+		fullname.reset (concat (s->compunit ()->dirname (),
+					SLASH_STRING, s->filename,
+					(char *) NULL));
 
-	  s->fullname = rewrite_source_path (fullname.get ()).release ();
-	  if (s->fullname == NULL)
-	    s->fullname = fullname.release ();
+	      s->fullname = rewrite_source_path (fullname.get ()).release ();
+	      if (s->fullname == NULL)
+		s->fullname = fullname.release ();
+	    }
 	}
     } 
 
@@ -1317,12 +1340,17 @@  print_source_lines_base (struct symtab *s, int line, int stopline,
       else
 	{
 	  last_source_visited = s;
-	  scoped_fd desc = open_source_file (s);
-	  last_source_error = desc.get () < 0;
-	  if (last_source_error)
+	  /* Do not attempt to open a source file for a symtab
+	     with an embedded source.  */
+	  if (!s->source)
 	    {
-	      noprint = true;
-	      errcode = -desc.get ();
+	      scoped_fd desc = open_source_file (s);
+	      last_source_error = desc.get () < 0;
+	      if (last_source_error)
+		{
+		  noprint = true;
+		  errcode = -desc.get ();
+		}
 	    }
 	}
     }
diff --git a/gdb/source.h b/gdb/source.h
index 144ee48f722..f27caf1a9b5 100644
--- a/gdb/source.h
+++ b/gdb/source.h
@@ -216,4 +216,8 @@  extern void forget_cached_source_info (void);
    need to would make things slower than necessary.  */
 extern void select_source_symtab ();
 
+/* Compute the fullname for a symtab whose source text is embedded
+   in the DWARF debug information rather than stored on disk.  */
+extern char *embedded_fullname (const char *dirname,
+				const char *filename);
 #endif
diff --git a/gdb/symmisc.c b/gdb/symmisc.c
index 49b9674f77a..8385e176a3e 100644
--- a/gdb/symmisc.c
+++ b/gdb/symmisc.c
@@ -813,6 +813,8 @@  maintenance_info_symtabs (const char *regexp, int from_tty)
 				symtab->fullname != NULL
 				? symtab->fullname
 				: "(null)");
+		    if (symtab->source != nullptr)
+		      gdb_printf ("\t  source embedded in DWARF\n");
 		    gdb_printf ("\t  "
 				"linetable ((struct linetable *) %s)\n",
 				host_address_to_string
@@ -955,14 +957,19 @@  maintenance_print_one_line_table (struct symtab *symtab, void *data)
   gdb_printf (_("compunit_symtab: %s ((struct compunit_symtab *) %s)\n"),
 	      symtab->compunit ()->name,
 	      host_address_to_string (symtab->compunit ()));
+  styled_string_s *styled_symtab_fullname = nullptr;
+  if (symtab->source)
+    styled_symtab_fullname = styled_string (metadata_style.style (),
+					    _("Source embedded in DWARF"));
+  else
+    styled_symtab_fullname = styled_string (file_name_style.style (),
+					    symtab_to_fullname (symtab));
   gdb_printf (_("symtab: %ps ((struct symtab *) %s)\n"),
-	      styled_string (file_name_style.style (),
-			     symtab_to_fullname (symtab)),
+	      styled_symtab_fullname,
 	      host_address_to_string (symtab));
   linetable = symtab->linetable ();
   gdb_printf (_("linetable: ((struct linetable *) %s):\n"),
 	      host_address_to_string (linetable));
-
   if (linetable == NULL)
     gdb_printf (_("No line table.\n"));
   else if (linetable->nitems <= 0)
diff --git a/gdb/symtab.h b/gdb/symtab.h
index bf9a3cfb79f..1f871ee0a74 100644
--- a/gdb/symtab.h
+++ b/gdb/symtab.h
@@ -1755,6 +1755,8 @@  struct symtab
 
   const char *filename;
 
+  const char *source;
+
   /* Filename for this source file, used as an identifier to link with
      related objects such as associated macro_source_file objects.  It must
      therefore match the name of any macro_source_file object created for this
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c
new file mode 100644
index 00000000000..6c03c84d4e7
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c
@@ -0,0 +1,24 @@ 
+/* Copyright 2022-2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+
+int
+main (void)
+{							/* main prologue */
+  asm ("main_label: .global main_label");
+  int m = 42;						/* main assign m */
+  asm ("main_end: .global main_end");			/* main end */
+  return m;
+}
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp
new file mode 100644
index 00000000000..505f5cfb937
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp
@@ -0,0 +1,88 @@ 
+# Copyright 2022-2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check that GDB can honor DW_LNCT_LLVM_SOURCE.
+
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2 and use gas.
+require dwarf2_support
+
+standard_testfile .c .S
+
+set asm_file [standard_output_file $srcfile2]
+
+set fp [open "${srcdir}/${subdir}/${srcfile}" r]
+set srcfile_data [read $fp]
+close $fp
+
+Dwarf::assemble $asm_file {
+    global srcdir subdir srcfile srcfile2 srcfile_data
+    declare_labels lines_label
+
+    get_func_info main
+
+    cu {} {
+	compile_unit {
+	    {language @DW_LANG_C}
+	    {name missing-file.c}
+	    {stmt_list ${lines_label} DW_FORM_sec_offset}
+	} {
+	    subprogram {
+		{external 1 flag}
+		{name main}
+		{low_pc $main_start addr}
+		{high_pc "$main_start + $main_len" addr}
+	    }
+	}
+    }
+
+    lines {version 5} lines_label {
+	set diridx [include_dir "${srcdir}/${subdir}"]
+	file_name "missing-file.c" $diridx "${srcfile_data}"
+
+	program {
+	    DW_LNS_set_file $diridx
+	    DW_LNE_set_address $main_start
+	    line [gdb_get_line_number "main prologue"]
+	    DW_LNS_copy
+
+	    DW_LNE_set_address main_label
+	    line [gdb_get_line_number "main assign m"]
+	    DW_LNS_copy
+
+	    DW_LNE_set_address main_end
+	    line [gdb_get_line_number "main end"]
+	    DW_LNS_copy
+
+	    DW_LNE_end_sequence
+	}
+    }
+}
+
+if { [prepare_for_testing "failed to prepare" ${testfile} \
+	  [list $srcfile $asm_file] {nodebug}] } {
+    return -1
+}
+
+if ![runto_main] {
+    return -1
+}
+
+set assign_m_line [gdb_get_line_number "main assign m"]
+gdb_test "frame" ".*main \\\(\\\) at \[^\r\n\]*:$assign_m_line\r\n.*"
+gdb_test "maintenance info symtabs missing-file.c" ".*source embedded in DWARF.*"
+gdb_test "maintenance info line-table missing-file.c" ".*symtab: Source embedded in DWARF.*"
+gdb_test "info source" ".*With embedded source.*"
diff --git a/gdb/testsuite/lib/dwarf.exp b/gdb/testsuite/lib/dwarf.exp
index d085f835f07..b1f09eab3d3 100644
--- a/gdb/testsuite/lib/dwarf.exp
+++ b/gdb/testsuite/lib/dwarf.exp
@@ -2423,9 +2423,9 @@  namespace eval Dwarf {
 	# Add a file name entry to the line table header's file names table.
 	#
 	# Return the index by which this entry can be referred to.
-	proc file_name {filename diridx} {
+	proc file_name {filename diridx { source "" } } {
 	    variable _line_file_names
-	    lappend _line_file_names $filename $diridx
+	    lappend _line_file_names $filename $diridx $source
 
 	    if { $Dwarf::_line_unit_version >= 5 } {
 		return [expr [llength $_line_file_names] - 1]
@@ -2481,7 +2481,7 @@  namespace eval Dwarf {
 		    }
 		}
 
-		_op .byte 2 "file_name_entry_format_count"
+		_op .byte 3 "file_name_entry_format_count"
 		_op .uleb128 1 \
 		    "file_name_entry_format (content type code: DW_LNCT_path)"
 		switch $_line_string_form {
@@ -2494,15 +2494,21 @@  namespace eval Dwarf {
 			    "directory_entry_format (form: DW_FORM_line_strp)"
 		    }
 		}
+
 		_op .uleb128 2 \
 		    "file_name_entry_format (content type code: DW_LNCT_directory_index)"
 		_op .uleb128 0x0f \
 		    "file_name_entry_format (form: DW_FORM_udata)"
 
-		set nr_files [expr [llength $_line_file_names] / 2]
+		_op .uleb128 0x2001 \
+		    "file_name_entry_format (content type code: DW_LNCT_LLVM_SOURCE)"
+		_op .uleb128 0x08 \
+		    "file_name_entry_format (form: DW_FORM_string)"
+
+		set nr_files [expr [llength $_line_file_names] / 3]
 		_op .byte $nr_files "file_names_count"
 
-		foreach { filename diridx } $_line_file_names {
+		foreach { filename diridx source } $_line_file_names {
 		    switch $_line_string_form {
 			string {
 			    _op .ascii [_quote $filename]
@@ -2517,6 +2523,7 @@  namespace eval Dwarf {
 			}
 		    }
 		    _op .uleb128 $diridx
+		    _op .ascii [_quote [string map { "\"" "\\\"" "\n" "\\n" } $source]]
 		}
 	    } else {
 		foreach dirname $_line_include_dirs {
@@ -2525,7 +2532,7 @@  namespace eval Dwarf {
 
 		_op .byte 0 "Terminator (include_directories)"
 
-		foreach { filename diridx } $_line_file_names {
+		foreach { filename diridx source } $_line_file_names {
 		    _op .ascii [_quote $filename]
 		    _op .sleb128 $diridx
 		    _op .sleb128 0 "mtime"
diff --git a/include/dwarf2.h b/include/dwarf2.h
index b3d3731ee83..bf8bdcd608f 100644
--- a/include/dwarf2.h
+++ b/include/dwarf2.h
@@ -289,6 +289,7 @@  enum dwarf_line_number_content_type
     DW_LNCT_size = 0x4,
     DW_LNCT_MD5 = 0x5,
     DW_LNCT_lo_user = 0x2000,
+    DW_LNCT_LLVM_SOURCE = 0x2001,
     DW_LNCT_hi_user = 0x3fff
   };