[v3] ld/PE: Handle MS style import libraries for files named *.exe too
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_binutils_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-arm |
success
|
Test passed
|
Commit Message
When handling MS style import libraries (also called short import
libraries, or ILF), we need to detect the kind of library.
So far, this has been done by looking at the member file names
in the import library - in an MS style import library, all the
member files for a specific library have the same member file
name - the name of the runtime module to link against. Usually
this is a DLL - thus we do a case insensitive comparison and
check if the suffix is .dll.
However, an .exe can also export symbols which can be linked
against in the same way. In particular, if linking against
WDK (Windows Driver Kit) import libraries, e.g. wdmsec.lib, the
import libraries can provide imports for ntoskrnl.exe.
Instead of specifically checking for *.dll (and *.exe, etc),
invert the condition and skip archive members named *.o and *.obj.
For any remaining archive members, that do contain .idata
sections, apply the renaming. (The renaming is also mostly
harmless if applied where it isn't needed; if archive members
already have unique file names, their relative ordering should
remain intact except for very contrieved cases.)
---
v3: Updated based on the earlier discussion. Also tested the
effect of applying renaming on GNU style import libraries where
no renaming is needed - where it is quite harmless.
---
ld/emultempl/pe.em | 29 ++++++++++++++++++++++-------
ld/emultempl/pep.em | 29 ++++++++++++++++++++++-------
2 files changed, 44 insertions(+), 14 deletions(-)
Comments
On 19.11.2024 14:27, Martin Storsjö wrote:
> When handling MS style import libraries (also called short import
> libraries, or ILF), we need to detect the kind of library.
>
> So far, this has been done by looking at the member file names
> in the import library - in an MS style import library, all the
> member files for a specific library have the same member file
> name - the name of the runtime module to link against. Usually
> this is a DLL - thus we do a case insensitive comparison and
> check if the suffix is .dll.
>
> However, an .exe can also export symbols which can be linked
> against in the same way. In particular, if linking against
> WDK (Windows Driver Kit) import libraries, e.g. wdmsec.lib, the
> import libraries can provide imports for ntoskrnl.exe.
>
> Instead of specifically checking for *.dll (and *.exe, etc),
> invert the condition and skip archive members named *.o and *.obj.
> For any remaining archive members, that do contain .idata
> sections, apply the renaming. (The renaming is also mostly
> harmless if applied where it isn't needed; if archive members
> already have unique file names, their relative ordering should
> remain intact except for very contrieved cases.)
Okay. Just one nit on comment text:
> --- a/ld/emultempl/pe.em
> +++ b/ld/emultempl/pe.em
> @@ -1747,13 +1747,28 @@ gld${EMULATION_NAME}_after_open (void)
>
> /* Microsoft import libraries may contain archive members for
> one or more DLLs, together with static object files.
> - Inspect all members that are named *.dll - check whether
> - they contain .idata sections. Do the renaming of all
> - archive members that seem to be Microsoft style import
> - objects. */
> + The head and sentinels are regular COFF object files,
> + while the thunks are special ILF files that get synthesized
> + by bfd into COFF object files.
> +
> + As Microsoft import libraries can be for a module with
> + almost any file name (*.dll, *.exe, etc), we can't easily
> + know which archive members to inspect.
> +
> + Inspect all members unless ones named *.o or *.obj (which
While I'm not a native speaker, I think this wants to be either "Inspect
all members, except ones ..." or "Inspect all members unless named ...".
Jan
On Thu, 21 Nov 2024, Jan Beulich wrote:
> On 19.11.2024 14:27, Martin Storsjö wrote:
>> When handling MS style import libraries (also called short import
>> libraries, or ILF), we need to detect the kind of library.
>>
>> So far, this has been done by looking at the member file names
>> in the import library - in an MS style import library, all the
>> member files for a specific library have the same member file
>> name - the name of the runtime module to link against. Usually
>> this is a DLL - thus we do a case insensitive comparison and
>> check if the suffix is .dll.
>>
>> However, an .exe can also export symbols which can be linked
>> against in the same way. In particular, if linking against
>> WDK (Windows Driver Kit) import libraries, e.g. wdmsec.lib, the
>> import libraries can provide imports for ntoskrnl.exe.
>>
>> Instead of specifically checking for *.dll (and *.exe, etc),
>> invert the condition and skip archive members named *.o and *.obj.
>> For any remaining archive members, that do contain .idata
>> sections, apply the renaming. (The renaming is also mostly
>> harmless if applied where it isn't needed; if archive members
>> already have unique file names, their relative ordering should
>> remain intact except for very contrieved cases.)
>
> Okay. Just one nit on comment text:
>
>> --- a/ld/emultempl/pe.em
>> +++ b/ld/emultempl/pe.em
>> @@ -1747,13 +1747,28 @@ gld${EMULATION_NAME}_after_open (void)
>>
>> /* Microsoft import libraries may contain archive members for
>> one or more DLLs, together with static object files.
>> - Inspect all members that are named *.dll - check whether
>> - they contain .idata sections. Do the renaming of all
>> - archive members that seem to be Microsoft style import
>> - objects. */
>> + The head and sentinels are regular COFF object files,
>> + while the thunks are special ILF files that get synthesized
>> + by bfd into COFF object files.
>> +
>> + As Microsoft import libraries can be for a module with
>> + almost any file name (*.dll, *.exe, etc), we can't easily
>> + know which archive members to inspect.
>> +
>> + Inspect all members unless ones named *.o or *.obj (which
>
> While I'm not a native speaker, I think this wants to be either "Inspect
> all members, except ones ..." or "Inspect all members unless named ...".
Oh, indeed, yes.
Ok to push with that changed?
// Martin
On 21.11.2024 13:21, Martin Storsjö wrote:
> On Thu, 21 Nov 2024, Jan Beulich wrote:
>
>> On 19.11.2024 14:27, Martin Storsjö wrote:
>>> When handling MS style import libraries (also called short import
>>> libraries, or ILF), we need to detect the kind of library.
>>>
>>> So far, this has been done by looking at the member file names
>>> in the import library - in an MS style import library, all the
>>> member files for a specific library have the same member file
>>> name - the name of the runtime module to link against. Usually
>>> this is a DLL - thus we do a case insensitive comparison and
>>> check if the suffix is .dll.
>>>
>>> However, an .exe can also export symbols which can be linked
>>> against in the same way. In particular, if linking against
>>> WDK (Windows Driver Kit) import libraries, e.g. wdmsec.lib, the
>>> import libraries can provide imports for ntoskrnl.exe.
>>>
>>> Instead of specifically checking for *.dll (and *.exe, etc),
>>> invert the condition and skip archive members named *.o and *.obj.
>>> For any remaining archive members, that do contain .idata
>>> sections, apply the renaming. (The renaming is also mostly
>>> harmless if applied where it isn't needed; if archive members
>>> already have unique file names, their relative ordering should
>>> remain intact except for very contrieved cases.)
>>
>> Okay. Just one nit on comment text:
>>
>>> --- a/ld/emultempl/pe.em
>>> +++ b/ld/emultempl/pe.em
>>> @@ -1747,13 +1747,28 @@ gld${EMULATION_NAME}_after_open (void)
>>>
>>> /* Microsoft import libraries may contain archive members for
>>> one or more DLLs, together with static object files.
>>> - Inspect all members that are named *.dll - check whether
>>> - they contain .idata sections. Do the renaming of all
>>> - archive members that seem to be Microsoft style import
>>> - objects. */
>>> + The head and sentinels are regular COFF object files,
>>> + while the thunks are special ILF files that get synthesized
>>> + by bfd into COFF object files.
>>> +
>>> + As Microsoft import libraries can be for a module with
>>> + almost any file name (*.dll, *.exe, etc), we can't easily
>>> + know which archive members to inspect.
>>> +
>>> + Inspect all members unless ones named *.o or *.obj (which
>>
>> While I'm not a native speaker, I think this wants to be either "Inspect
>> all members, except ones ..." or "Inspect all members unless named ...".
>
> Oh, indeed, yes.
>
> Ok to push with that changed?
Yes.
Jan
On Thu, 21 Nov 2024, Jan Beulich wrote:
> On 21.11.2024 13:21, Martin Storsjö wrote:
>> On Thu, 21 Nov 2024, Jan Beulich wrote:
>>
>>> On 19.11.2024 14:27, Martin Storsjö wrote:
>>>> When handling MS style import libraries (also called short import
>>>> libraries, or ILF), we need to detect the kind of library.
>>>>
>>>> So far, this has been done by looking at the member file names
>>>> in the import library - in an MS style import library, all the
>>>> member files for a specific library have the same member file
>>>> name - the name of the runtime module to link against. Usually
>>>> this is a DLL - thus we do a case insensitive comparison and
>>>> check if the suffix is .dll.
>>>>
>>>> However, an .exe can also export symbols which can be linked
>>>> against in the same way. In particular, if linking against
>>>> WDK (Windows Driver Kit) import libraries, e.g. wdmsec.lib, the
>>>> import libraries can provide imports for ntoskrnl.exe.
>>>>
>>>> Instead of specifically checking for *.dll (and *.exe, etc),
>>>> invert the condition and skip archive members named *.o and *.obj.
>>>> For any remaining archive members, that do contain .idata
>>>> sections, apply the renaming. (The renaming is also mostly
>>>> harmless if applied where it isn't needed; if archive members
>>>> already have unique file names, their relative ordering should
>>>> remain intact except for very contrieved cases.)
>>>
>>> Okay. Just one nit on comment text:
>>>
>>>> --- a/ld/emultempl/pe.em
>>>> +++ b/ld/emultempl/pe.em
>>>> @@ -1747,13 +1747,28 @@ gld${EMULATION_NAME}_after_open (void)
>>>>
>>>> /* Microsoft import libraries may contain archive members for
>>>> one or more DLLs, together with static object files.
>>>> - Inspect all members that are named *.dll - check whether
>>>> - they contain .idata sections. Do the renaming of all
>>>> - archive members that seem to be Microsoft style import
>>>> - objects. */
>>>> + The head and sentinels are regular COFF object files,
>>>> + while the thunks are special ILF files that get synthesized
>>>> + by bfd into COFF object files.
>>>> +
>>>> + As Microsoft import libraries can be for a module with
>>>> + almost any file name (*.dll, *.exe, etc), we can't easily
>>>> + know which archive members to inspect.
>>>> +
>>>> + Inspect all members unless ones named *.o or *.obj (which
>>>
>>> While I'm not a native speaker, I think this wants to be either "Inspect
>>> all members, except ones ..." or "Inspect all members unless named ...".
>>
>> Oh, indeed, yes.
>>
>> Ok to push with that changed?
>
> Yes.
Thanks, this is pushed now.
// Martin
@@ -1747,13 +1747,28 @@ gld${EMULATION_NAME}_after_open (void)
/* Microsoft import libraries may contain archive members for
one or more DLLs, together with static object files.
- Inspect all members that are named *.dll - check whether
- they contain .idata sections. Do the renaming of all
- archive members that seem to be Microsoft style import
- objects. */
+ The head and sentinels are regular COFF object files,
+ while the thunks are special ILF files that get synthesized
+ by bfd into COFF object files.
+
+ As Microsoft import libraries can be for a module with
+ almost any file name (*.dll, *.exe, etc), we can't easily
+ know which archive members to inspect.
+
+ Inspect all members unless ones named *.o or *.obj (which
+ is the case both for regular static libraries or for GNU
+ style import libraries). Archive members with file names other
+ than *.o or *.obj, that do contain .idata sections, are
+ considered to be Microsoft style import objects, and are
+ renamed accordingly.
+
+ If this heuristic is wrong and we apply this on archive members
+ that already have unique names, it shouldn't make any difference
+ as we only append a suffix on the names. */
pnt = strrchr (bfd_get_filename (is->the_bfd), '.');
- if (pnt != NULL && (fileext_cmp (pnt + 1, "dll") == 0))
+ if (pnt != NULL && (fileext_cmp (pnt + 1, "o") != 0 &&
+ fileext_cmp (pnt + 1, "obj") != 0))
{
int idata2 = 0, reloc_count = 0, idata = 0;
asection *sec;
@@ -1768,8 +1783,8 @@ gld${EMULATION_NAME}_after_open (void)
reloc_count += sec->reloc_count;
}
- /* An archive member named .dll, but not having any .idata
- sections - apparently not a Microsoft import object
+ /* An archive member not named .o or .obj, but not having any
+ .idata sections - apparently not a Microsoft import object
after all: Skip renaming it. */
if (!idata)
continue;
@@ -1730,13 +1730,28 @@ gld${EMULATION_NAME}_after_open (void)
/* Microsoft import libraries may contain archive members for
one or more DLLs, together with static object files.
- Inspect all members that are named *.dll - check whether
- they contain .idata sections. Do the renaming of all
- archive members that seem to be Microsoft style import
- objects. */
+ The head and sentinels are regular COFF object files,
+ while the thunks are special ILF files that get synthesized
+ by bfd into COFF object files.
+
+ As Microsoft import libraries can be for a module with
+ almost any file name (*.dll, *.exe, etc), we can't easily
+ know which archive members to inspect.
+
+ Inspect all members unless ones named *.o or *.obj (which
+ is the case both for regular static libraries or for GNU
+ style import libraries). Archive members with file names other
+ than *.o or *.obj, that do contain .idata sections, are
+ considered to be Microsoft style import objects, and are
+ renamed accordingly.
+
+ If this heuristic is wrong and we apply this on archive members
+ that already have unique names, it shouldn't make any difference
+ as we only append a suffix on the names. */
pnt = strrchr (bfd_get_filename (is->the_bfd), '.');
- if (pnt != NULL && (fileext_cmp (pnt + 1, "dll") == 0))
+ if (pnt != NULL && (fileext_cmp (pnt + 1, "o") != 0 &&
+ fileext_cmp (pnt + 1, "obj") != 0))
{
int idata2 = 0, reloc_count = 0, idata = 0;
asection *sec;
@@ -1751,8 +1766,8 @@ gld${EMULATION_NAME}_after_open (void)
reloc_count += sec->reloc_count;
}
- /* An archive member named .dll, but not having any .idata
- sections - apparently not a Microsoft import object
+ /* An archive member not named .o or .obj, but not having any
+ .idata sections - apparently not a Microsoft import object
after all: Skip renaming it. */
if (!idata)
continue;