[v3] bfd: fill in PE load config directory entry.

Message ID e2ba2d4e-95f7-4d21-725c-47df616c3efc@jdrake.com
State New
Headers
Series [v3] bfd: fill in PE load config directory entry. |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 success Test passed

Commit Message

Jeremy Drake March 13, 2025, 5:14 p.m. UTC
  This is filled in with the rva of _load_config_used if defined (much
like _tls_used), and the size is the first 32-bit value at that symbol.

Signed-off-by: Jeremy Drake <sourceware-bugzilla@jdrake.com>
---
 bfd/peXXigen.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 46 insertions(+), 2 deletions(-)
  

Comments

Jan Beulich March 20, 2025, 11:31 a.m. UTC | #1
On 13.03.2025 18:14, Jeremy Drake wrote:
> @@ -4573,6 +4575,48 @@ _bfd_XXi_final_link_postscript (bfd * abfd, struct coff_final_link_info *pfinfo)
>  #endif
>      }
> 
> +  h1 = coff_link_hash_lookup (coff_hash_table (info),
> +			      (bfd_get_symbol_leading_char (abfd) != 0
> +			       ? "__load_config_used" : "_load_config_used"),

Still assuming bfd_get_symbol_leading_char() can only ever return nil or '_'?

> +			      false, false, true);
> +  if (h1 != NULL)
> +    {
> +      char data[4];
> +      if ((h1->root.type == bfd_link_hash_defined
> +	   || h1->root.type == bfd_link_hash_defweak)
> +	  && h1->root.u.def.section != NULL
> +	  && h1->root.u.def.section->output_section != NULL)
> +	{
> +	  pe_data (abfd)->pe_opthdr.DataDirectory[PE_LOAD_CONFIG_TABLE].VirtualAddress =
> +	    (h1->root.u.def.value
> +	     + h1->root.u.def.section->output_section->vma
> +	     + h1->root.u.def.section->output_offset
> +	     - pe_data (abfd)->pe_opthdr.ImageBase);
> +
> +	  /* the size is stored as the first 4 bytes at _load_config_used */
> +	  if (bfd_get_section_contents (abfd,
> +		h1->root.u.def.section->output_section, data,
> +		h1->root.u.def.section->output_offset + h1->root.u.def.value,
> +		4))
> +	    pe_data (abfd)->pe_opthdr.DataDirectory[PE_LOAD_CONFIG_TABLE].Size =
> +	      bfd_get_32 (abfd, data);

I wonder if we can indeed blindly use that value, with no sanity checking
whatsoever.

Furthermore, having checked two random executables each from two random VC
installations, I find that the size in the data directory and the value at
the start of the referenced RVA aren't the same: In all four cases the data
directory says 0x40 while the 32-bit item at the indicated RVA is 0x48.

Jan
  
Jeremy Drake March 20, 2025, 6:22 p.m. UTC | #2
On Thu, 20 Mar 2025, Jan Beulich wrote:

> On 13.03.2025 18:14, Jeremy Drake wrote:
> > @@ -4573,6 +4575,48 @@ _bfd_XXi_final_link_postscript (bfd * abfd, struct coff_final_link_info *pfinfo)
> >  #endif
> >      }
> >
> > +  h1 = coff_link_hash_lookup (coff_hash_table (info),
> > +			      (bfd_get_symbol_leading_char (abfd) != 0
> > +			       ? "__load_config_used" : "_load_config_used"),
>
> Still assuming bfd_get_symbol_leading_char() can only ever return nil or '_'?

Yes:
> Nevertheless, because of the pre-existing similar code, I'm not going to
> insist that this be corrected right here.

> I wonder if we can indeed blindly use that value, with no sanity checking
> whatsoever.
>
> Furthermore, having checked two random executables each from two random VC
> installations, I find that the size in the data directory and the value at
> the start of the referenced RVA aren't the same: In all four cases the data
> directory says 0x40 while the 32-bit item at the indicated RVA is 0x48.

That's odd.  This matches what LLD does (though they do indeed do some
sanity checking):
https://github.com/llvm/llvm-project/blob/49f06075a6d298bd13564c9bffcf51281bed4962/lld/COFF/SymbolTable.cpp#L576
  
Jan Beulich March 21, 2025, 7:04 a.m. UTC | #3
On 20.03.2025 19:22, Jeremy Drake wrote:
> On Thu, 20 Mar 2025, Jan Beulich wrote:
>> On 13.03.2025 18:14, Jeremy Drake wrote:
>>> @@ -4573,6 +4575,48 @@ _bfd_XXi_final_link_postscript (bfd * abfd, struct coff_final_link_info *pfinfo)
>>>  #endif
>>>      }
>>>
>>> +  h1 = coff_link_hash_lookup (coff_hash_table (info),
>>> +			      (bfd_get_symbol_leading_char (abfd) != 0
>>> +			       ? "__load_config_used" : "_load_config_used"),
>>
>> Still assuming bfd_get_symbol_leading_char() can only ever return nil or '_'?
> 
> Yes:
>> Nevertheless, because of the pre-existing similar code, I'm not going to
>> insist that this be corrected right here.

Oh, right, I said that earlier on. A remark towards the limitation in the
commit message would maybe help.

>> I wonder if we can indeed blindly use that value, with no sanity checking
>> whatsoever.
>>
>> Furthermore, having checked two random executables each from two random VC
>> installations, I find that the size in the data directory and the value at
>> the start of the referenced RVA aren't the same: In all four cases the data
>> directory says 0x40 while the 32-bit item at the indicated RVA is 0x48.
> 
> That's odd.  This matches what LLD does (though they do indeed do some
> sanity checking):
> https://github.com/llvm/llvm-project/blob/49f06075a6d298bd13564c9bffcf51281bed4962/lld/COFF/SymbolTable.cpp#L576

As one more data point, a quote from MS doc: "For Windows XP, the size must
be specified as 64 for x86 images." It looks odd to me though that the exact
same size is stated for both the 32- and 64-bit struct.

Since hopefully hardly anyone still cares about XP, doing what you do (and
what llvm does) is likely fine, but imo wants to come with a code comment
then.

Jan
  
LIU Hao March 21, 2025, 7:43 a.m. UTC | #4
在 2025-3-21 15:04, Jan Beulich 写道:
>> That's odd.  This matches what LLD does (though they do indeed do some
>> sanity checking):
>> https://github.com/llvm/llvm-project/blob/49f06075a6d298bd13564c9bffcf51281bed4962/lld/COFF/SymbolTable.cpp#L576
> 
> As one more data point, a quote from MS doc: "For Windows XP, the size must
> be specified as 64 for x86 images." It looks odd to me though that the exact
> same size is stated for both the 32- and 64-bit struct.
> 
> Since hopefully hardly anyone still cares about XP, doing what you do (and
> what llvm does) is likely fine, but imo wants to come with a code comment
> then.

If we would like to follow the specification [1], we can check the subsystem version. It specifies the 
Windows kernel version when the subsystem is `console` or windows`, which is

    * `5.1` for Windows XP x86-32;
    * `5.2` for Windows XP x86-64 and Windows Server 2003;
    * `6.0` for Windows Vista and Windows Server 2008 (default for x86);
    * `6.1` for Windows 7 and Windows Server 2008 R2.
    * `6.2` for Windows 8 and Windows Server 2012 (default for ARM).

So hard-coding the value to 64 when subsystem equals 5.1 makes some sense. LLVM seems to have a default 
value of 6.0, and in a sense they don't truly support XP.


[1] https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#load-configuration-directory




-- 
Best regards,
LIU Hao
  
LIU Hao March 22, 2025, 4:02 p.m. UTC | #5
在 2025-3-21 15:04, Jan Beulich 写道:
> On 20.03.2025 19:22, Jeremy Drake wrote:
>> On Thu, 20 Mar 2025, Jan Beulich wrote:
>>> Furthermore, having checked two random executables each from two random VC
>>> installations, I find that the size in the data directory and the value at
>>> the start of the referenced RVA aren't the same: In all four cases the data
>>> directory says 0x40 while the 32-bit item at the indicated RVA is 0x48.
>>
>> That's odd.  This matches what LLD does (though they do indeed do some
>> sanity checking):
>> https://github.com/llvm/llvm-project/blob/49f06075a6d298bd13564c9bffcf51281bed4962/lld/COFF/SymbolTable.cpp#L576
> 
> As one more data point, a quote from MS doc: "For Windows XP, the size must
> be specified as 64 for x86 images." It looks odd to me though that the exact
> same size is stated for both the 32- and 64-bit struct.
> 
> Since hopefully hardly anyone still cares about XP, doing what you do (and
> what llvm does) is likely fine, but imo wants to come with a code comment
> then.

There's quite a little information floating around on the Internet. Most of the materials about reverse 
engineering are in Chinese however I have found an English one which you might have a little interest [1].

It looks like SafeSEH was introduced in Visual Studio 2003 and became the default in Visual Studio 2005. 
If that's true then it's predated by Windows XP. So when XP was released, the load configure directory 
must have had a size of 64 (0x40). When SafeSEH was implemented in the operating system (around XP SP2, I 
suspect) two fields (`SEHandlerTable` and `SEHandlerCount`) were added, increasing the size of 
`IMAGE_LOAD_CONFIG_DIRECTORY32` to 72 (0x48), but perhaps the DLL loader was not updated and it was 
checking for 64, but the structure always has a `Size` field with value 72, so it must be a special case 
for the loader to handle that.


[1] 
https://www.cyberark.com/resources/threat-research-blog/a-modern-exploration-of-windows-memory-corruption-exploits-part-i-stack-overflows


-- 
Best regards,
LIU Hao
  
Jeremy Drake March 22, 2025, 11:26 p.m. UTC | #6
I might not be able to get right on this right now, I've got some other
stuff going on, but I want to memorialize my understanding at least.

On Fri, 21 Mar 2025, Jan Beulich wrote:

> On 20.03.2025 19:22, Jeremy Drake wrote:
> > On Thu, 20 Mar 2025, Jan Beulich wrote:
> >> Still assuming bfd_get_symbol_leading_char() can only ever return nil or '_'?
> >
> > Yes:
> >> Nevertheless, because of the pre-existing similar code, I'm not going to
> >> insist that this be corrected right here.
>
> Oh, right, I said that earlier on. A remark towards the limitation in the
> commit message would maybe help.
>
> >> I wonder if we can indeed blindly use that value, with no sanity checking
> >> whatsoever.
> >>
> >> Furthermore, having checked two random executables each from two random VC
> >> installations, I find that the size in the data directory and the value at
> >> the start of the referenced RVA aren't the same: In all four cases the data
> >> directory says 0x40 while the 32-bit item at the indicated RVA is 0x48.
> >
> > That's odd.  This matches what LLD does (though they do indeed do some
> > sanity checking):
> > https://github.com/llvm/llvm-project/blob/49f06075a6d298bd13564c9bffcf51281bed4962/lld/COFF/SymbolTable.cpp#L576
>
> As one more data point, a quote from MS doc: "For Windows XP, the size must
> be specified as 64 for x86 images." It looks odd to me though that the exact
> same size is stated for both the 32- and 64-bit struct.
>
> Since hopefully hardly anyone still cares about XP, doing what you do (and
> what llvm does) is likely fine, but imo wants to come with a code comment
> then.


* Add note to commit message about bfd_get_symbol_leading_char actually
returning the char in question, and code shouldn't be assuming it's '_'.
(I've only dealt with this on Windows, and other code I've dealt with
generally has a boolean as to whether to add an underscore or not (and
then only sets it to true on i386)).

* Add sanity checks (symbol + size is still contained in section,
required alignment is satisfied)

* Add code comment about possible issues on Windows XP where it might be
necessary to lie about the size in the directory entry?

Feel free to correct me if any of that sounds wrong.  Also, I wouldn't be
offended if somebody else picks this up in the meantime and did a
Co-authored-by or something.  I do still have another patch series in
https://github.com/jeremyd2019/binutils-gdb/pulls fixing delay-load
crashes introduced in 2.44.  That's probably more important than this, but
I was hoping to figure out the process and pitfalls of contributing to
binutils before tackling that more complicated change.  Unfortunately life
has intervened and I may not be able to get back to that soon either, so
again, feel free to pick it up and run with that too.
  

Patch

diff --git a/bfd/peXXigen.c b/bfd/peXXigen.c
index 29012688adf..858498a04f3 100644
--- a/bfd/peXXigen.c
+++ b/bfd/peXXigen.c
@@ -593,7 +593,7 @@  _bfd_XXi_swap_aouthdr_out (bfd * abfd, void * in, void * out)
   struct internal_extra_pe_aouthdr *extra = &pe->pe_opthdr;
   PEAOUTHDR *aouthdr_out = (PEAOUTHDR *) out;
   bfd_vma sa, fa, ib;
-  IMAGE_DATA_DIRECTORY idata2, idata5, tls;
+  IMAGE_DATA_DIRECTORY idata2, idata5, tls, loadcfg;

   sa = extra->SectionAlignment;
   fa = extra->FileAlignment;
@@ -602,6 +602,7 @@  _bfd_XXi_swap_aouthdr_out (bfd * abfd, void * in, void * out)
   idata2 = pe->pe_opthdr.DataDirectory[PE_IMPORT_TABLE];
   idata5 = pe->pe_opthdr.DataDirectory[PE_IMPORT_ADDRESS_TABLE];
   tls = pe->pe_opthdr.DataDirectory[PE_TLS_TABLE];
+  loadcfg = pe->pe_opthdr.DataDirectory[PE_LOAD_CONFIG_TABLE];

   if (aouthdr_in->tsize)
     {
@@ -651,6 +652,7 @@  _bfd_XXi_swap_aouthdr_out (bfd * abfd, void * in, void * out)
   extra->DataDirectory[PE_IMPORT_TABLE]  = idata2;
   extra->DataDirectory[PE_IMPORT_ADDRESS_TABLE] = idata5;
   extra->DataDirectory[PE_TLS_TABLE] = tls;
+  extra->DataDirectory[PE_LOAD_CONFIG_TABLE] = loadcfg;

   if (extra->DataDirectory[PE_IMPORT_TABLE].VirtualAddress == 0)
     /* Until other .idata fixes are made (pending patch), the entry for
@@ -4558,7 +4560,7 @@  _bfd_XXi_final_link_postscript (bfd * abfd, struct coff_final_link_info *pfinfo)
       else
 	{
 	  _bfd_error_handler
-	    (_("%pB: unable to fill in DataDictionary[9] because __tls_used is missing"),
+	    (_("%pB: unable to fill in DataDictionary[9] because _tls_used is not defined correctly"),
 	     abfd);
 	  result = false;
 	}
@@ -4573,6 +4575,48 @@  _bfd_XXi_final_link_postscript (bfd * abfd, struct coff_final_link_info *pfinfo)
 #endif
     }

+  h1 = coff_link_hash_lookup (coff_hash_table (info),
+			      (bfd_get_symbol_leading_char (abfd) != 0
+			       ? "__load_config_used" : "_load_config_used"),
+			      false, false, true);
+  if (h1 != NULL)
+    {
+      char data[4];
+      if ((h1->root.type == bfd_link_hash_defined
+	   || h1->root.type == bfd_link_hash_defweak)
+	  && h1->root.u.def.section != NULL
+	  && h1->root.u.def.section->output_section != NULL)
+	{
+	  pe_data (abfd)->pe_opthdr.DataDirectory[PE_LOAD_CONFIG_TABLE].VirtualAddress =
+	    (h1->root.u.def.value
+	     + h1->root.u.def.section->output_section->vma
+	     + h1->root.u.def.section->output_offset
+	     - pe_data (abfd)->pe_opthdr.ImageBase);
+
+	  /* the size is stored as the first 4 bytes at _load_config_used */
+	  if (bfd_get_section_contents (abfd,
+		h1->root.u.def.section->output_section, data,
+		h1->root.u.def.section->output_offset + h1->root.u.def.value,
+		4))
+	    pe_data (abfd)->pe_opthdr.DataDirectory[PE_LOAD_CONFIG_TABLE].Size =
+	      bfd_get_32 (abfd, data);
+	  else
+	    {
+	      _bfd_error_handler
+		(_("%pB: unable to fill in DataDictionary[10] because the size can't be read"),
+		 abfd);
+	      result = false;
+	    }
+	}
+      else
+	{
+	  _bfd_error_handler
+	    (_("%pB: unable to fill in DataDictionary[10] because _load_config_used is not defined correctly"),
+	     abfd);
+	  result = false;
+	}
+    }
+
 /* If there is a .pdata section and we have linked pdata finally, we
      need to sort the entries ascending.  */
 #if !defined(COFF_WITH_pep) && (defined(COFF_WITH_pex64) || defined(COFF_WITH_peAArch64) || defined(COFF_WITH_peLoongArch64) || defined (COFF_WITH_peRiscV64))