RISC-V: hash with segment id and pcrel_hi address while recording pcrel_hi
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_binutils_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-arm |
success
|
Test passed
|
Commit Message
From: Lifang Xia <lifang_xia@linux.alibaba.com>
When the same address across different segments needs to be recorded, it will
overwrite the slot, leading to a memory leak. To ensure uniqueness, the
segment ID needs to be included in the hash key calculation.
gas/
* config/tc-riscv.c (riscv_pcrel_hi_fixup): New member "seg".
(riscv_pcrel_fixup_hash): Get hash with seg->id and e->adrsess.
(riscv_pcrel_fixup_eq): Check seg->id at first.
(riscv_record_pcrel_fixup): New member "seg".
(md_apply_fix) <case BFD_RELOC_RISCV_PCREL_HI20>: Likewise.
(md_apply_fix) <case BFD_RELOC_RISCV_PCREL_LO12_I>: Likewise.
---
gas/config/tc-riscv.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)
Comments
On Fri, Jun 21, 2024 at 9:25 AM <lifang_xia@linux.alibaba.com> wrote:
> From: Lifang Xia <lifang_xia@linux.alibaba.com>
>
> When the same address across different segments needs to be recorded, it
> will
> overwrite the slot, leading to a memory leak. To ensure uniqueness, the
> segment ID needs to be included in the hash key calculation.
>
Yeah, this makes sense.
> @@ -1798,7 +1799,13 @@ static hashval_t
> riscv_pcrel_fixup_hash (const void *entry)
> {
> const riscv_pcrel_hi_fixup *e = entry;
> - return (hashval_t) (e->address);
> +
> + /* the pcrel_hi with same address may reside in different segments,
> + to ensure uniqueness, the segment ID needs to be included in the
> + hash key calculation.
> + There is no hash for two integer, refer to ctf_hash_integer. */
> + return htab_hash_pointer ((void *)(uintptr_t)e->seg->id)
> + + 59 * htab_hash_pointer ((void *)(uintptr_t)e->address);
> }
>
The riscv_segment_info_type is used to record target stuff for special
works. We used to record mapping symbol stuff there. Maybe we can have a
pcrel_hi hash table with address as key, in each riscv_segment_info_type if
there is/are pcrel_hi instructions, and have another summarized hash table
with segment id as key. I am not familiar with the ctf_hash_integer stuff,
so not sure which solution is better. I think Alan, Nick and Jan know
these better since they are experts, so cc them, and hope they may have
time to give us some suggestions ;)
Hi Alan, Nick, Jan,
We met the problem that there are no internal hash tables that can use two
keys - one key is the segment id, the other is the riscv auipc (pcrel hi)
instruction address in the segment. Do you think it's fine to refer to how
ctf_hash_integer works in this patch? Or we should have multiple hash
tables for each segment in riscv_segment_info_type if there are auipc
instructions (auipc address of segment as key), and have another one hash
table to collect the segment (segment id as key) which has any auipc.
Thanks in advance
Nelson
On 25.06.2024 06:31, Nelson Chu wrote:
> On Fri, Jun 21, 2024 at 9:25 AM <lifang_xia@linux.alibaba.com> wrote:
>
>> From: Lifang Xia <lifang_xia@linux.alibaba.com>
>>
>> When the same address across different segments needs to be recorded, it
>> will
>> overwrite the slot, leading to a memory leak. To ensure uniqueness, the
>> segment ID needs to be included in the hash key calculation.
>>
>
> Yeah, this makes sense.
>
>
>> @@ -1798,7 +1799,13 @@ static hashval_t
>> riscv_pcrel_fixup_hash (const void *entry)
>> {
>> const riscv_pcrel_hi_fixup *e = entry;
>> - return (hashval_t) (e->address);
>> +
>> + /* the pcrel_hi with same address may reside in different segments,
>> + to ensure uniqueness, the segment ID needs to be included in the
>> + hash key calculation.
>> + There is no hash for two integer, refer to ctf_hash_integer. */
>> + return htab_hash_pointer ((void *)(uintptr_t)e->seg->id)
>> + + 59 * htab_hash_pointer ((void *)(uintptr_t)e->address);
>> }
>>
>
> The riscv_segment_info_type is used to record target stuff for special
> works. We used to record mapping symbol stuff there. Maybe we can have a
> pcrel_hi hash table with address as key, in each riscv_segment_info_type if
> there is/are pcrel_hi instructions, and have another summarized hash table
> with segment id as key. I am not familiar with the ctf_hash_integer stuff,
> so not sure which solution is better. I think Alan, Nick and Jan know
> these better since they are experts, so cc them, and hope they may have
> time to give us some suggestions ;)
>
>
> Hi Alan, Nick, Jan,
>
> We met the problem that there are no internal hash tables that can use two
> keys - one key is the segment id, the other is the riscv auipc (pcrel hi)
> instruction address in the segment. Do you think it's fine to refer to how
> ctf_hash_integer works in this patch? Or we should have multiple hash
> tables for each segment in riscv_segment_info_type if there are auipc
> instructions (auipc address of segment as key), and have another one hash
> table to collect the segment (segment id as key) which has any auipc.
In principle that's fine, I think, just that the reference to
ctf_hash_integer() seems misleading. The two similar CTF functions are
ctf_hash_type_key() and ctf_hash_type_id_key(), afaics.
I don't really understand, though, why htab_hash_pointer() wants / needs
using here. Wouldn't (deriving from what was there before)
"e->address + 59 * e->seg->id" suffice (multiplier subject to possible
improvement), and be less of a behavioral change?
Jan
> 2024年6月25日 16:25,Jan Beulich <jbeulich@suse.com> 写道:
>
> On 25.06.2024 06:31, Nelson Chu wrote:
>> On Fri, Jun 21, 2024 at 9:25 AM <lifang_xia@linux.alibaba.com> wrote:
>>
>>> From: Lifang Xia <lifang_xia@linux.alibaba.com>
>>>
>>> When the same address across different segments needs to be recorded, it
>>> will
>>> overwrite the slot, leading to a memory leak. To ensure uniqueness, the
>>> segment ID needs to be included in the hash key calculation.
>>>
>>
>> Yeah, this makes sense.
>>
>>
>>> @@ -1798,7 +1799,13 @@ static hashval_t
>>> riscv_pcrel_fixup_hash (const void *entry)
>>> {
>>> const riscv_pcrel_hi_fixup *e = entry;
>>> - return (hashval_t) (e->address);
>>> +
>>> + /* the pcrel_hi with same address may reside in different segments,
>>> + to ensure uniqueness, the segment ID needs to be included in the
>>> + hash key calculation.
>>> + There is no hash for two integer, refer to ctf_hash_integer. */
>>> + return htab_hash_pointer ((void *)(uintptr_t)e->seg->id)
>>> + + 59 * htab_hash_pointer ((void *)(uintptr_t)e->address);
>>> }
>>>
>>
>> The riscv_segment_info_type is used to record target stuff for special
>> works. We used to record mapping symbol stuff there. Maybe we can have a
>> pcrel_hi hash table with address as key, in each riscv_segment_info_type if
>> there is/are pcrel_hi instructions, and have another summarized hash table
>> with segment id as key. I am not familiar with the ctf_hash_integer stuff,
>> so not sure which solution is better. I think Alan, Nick and Jan know
>> these better since they are experts, so cc them, and hope they may have
>> time to give us some suggestions ;)
>>
>>
>> Hi Alan, Nick, Jan,
>>
>> We met the problem that there are no internal hash tables that can use two
>> keys - one key is the segment id, the other is the riscv auipc (pcrel hi)
>> instruction address in the segment. Do you think it's fine to refer to how
>> ctf_hash_integer works in this patch? Or we should have multiple hash
>> tables for each segment in riscv_segment_info_type if there are auipc
>> instructions (auipc address of segment as key), and have another one hash
>> table to collect the segment (segment id as key) which has any auipc.
>
> In principle that's fine, I think, just that the reference to
> ctf_hash_integer() seems misleading. The two similar CTF functions are
> ctf_hash_type_key() and ctf_hash_type_id_key(), afaics.
Em.. Sorry, It’s a mistake. I went to comment with “ctf_hash_type_key”.
>
> I don't really understand, though, why htab_hash_pointer() wants / needs
> using here. Wouldn't (deriving from what was there before)
> "e->address + 59 * e->seg->id" suffice (multiplier subject to possible
> improvement), and be less of a behavioral change?
"e->address + 59 * e->seg->id” would be fine. It's just necessary
to ensure uniqueness within a finite range.
As for the multiplier, do you have any suggestions? I feel that 59 might be
too small. Is 127 a better choice?
> Jan
On 02.07.2024 05:20, lifang_xia wrote:
>> 2024年6月25日 16:25,Jan Beulich <jbeulich@suse.com> 写道:
>> I don't really understand, though, why htab_hash_pointer() wants / needs
>> using here. Wouldn't (deriving from what was there before)
>> "e->address + 59 * e->seg->id" suffice (multiplier subject to possible
>> improvement), and be less of a behavioral change?
> "e->address + 59 * e->seg->id” would be fine. It's just necessary
> to ensure uniqueness within a finite range.
> As for the multiplier, do you have any suggestions? I feel that 59 might be
> too small. Is 127 a better choice?
Why would you think the absolute value matters much? As to 127, I'd view
this as sub-optimal, for being 2^^7-1 (i.e. a little too "regular"). But
surely you should be fine picking whatever larger prime you like.
Jan
@@ -1784,6 +1784,7 @@ init_opcode_hash (const struct riscv_opcode *opcodes,
help us resolve the corresponding low-part relocation later. */
typedef struct
{
+ segT seg;
bfd_vma address;
symbolS *symbol;
bfd_vma target;
@@ -1798,7 +1799,13 @@ static hashval_t
riscv_pcrel_fixup_hash (const void *entry)
{
const riscv_pcrel_hi_fixup *e = entry;
- return (hashval_t) (e->address);
+
+ /* the pcrel_hi with same address may reside in different segments,
+ to ensure uniqueness, the segment ID needs to be included in the
+ hash key calculation.
+ There is no hash for two integer, refer to ctf_hash_integer. */
+ return htab_hash_pointer ((void *)(uintptr_t)e->seg->id)
+ + 59 * htab_hash_pointer ((void *)(uintptr_t)e->address);
}
/* Compare the keys between two entries fo the pcrel_hi hash table. */
@@ -1807,16 +1814,17 @@ static int
riscv_pcrel_fixup_eq (const void *entry1, const void *entry2)
{
const riscv_pcrel_hi_fixup *e1 = entry1, *e2 = entry2;
- return e1->address == e2->address;
+ return e1->seg->id == e2->seg->id
+ && e1->address == e2->address;
}
/* Record the pcrel_hi relocation. */
static bool
-riscv_record_pcrel_fixup (htab_t p, bfd_vma address, symbolS *symbol,
+riscv_record_pcrel_fixup (htab_t p, segT seg, bfd_vma address, symbolS *symbol,
bfd_vma target)
{
- riscv_pcrel_hi_fixup entry = {address, symbol, target};
+ riscv_pcrel_hi_fixup entry = {seg, address, symbol, target};
riscv_pcrel_hi_fixup **slot =
(riscv_pcrel_hi_fixup **) htab_find_slot (p, &entry, INSERT);
if (slot == NULL)
@@ -4676,11 +4684,10 @@ md_apply_fix (fixS *fixP, valueT *valP, segT seg ATTRIBUTE_UNUSED)
bfd_vma value = target - md_pcrel_from (fixP);
/* Record PCREL_HI20. */
- if (!riscv_record_pcrel_fixup (riscv_pcrel_hi_fixup_hash,
- md_pcrel_from (fixP),
- fixP->fx_addsy,
- target))
- as_warn (_("too many pcrel_hi"));
+ if (!riscv_record_pcrel_fixup (riscv_pcrel_hi_fixup_hash, seg,
+ md_pcrel_from (fixP), fixP->fx_addsy,
+ target))
+ as_warn (_("too many pcrel_hi"));
bfd_putl32 (bfd_getl32 (buf)
| ENCODE_UTYPE_IMM (RISCV_CONST_HIGH_PART (value)),
@@ -4698,7 +4705,7 @@ md_apply_fix (fixS *fixP, valueT *valP, segT seg ATTRIBUTE_UNUSED)
and set fx_done for -mno-relax. */
{
bfd_vma location_pcrel_hi = S_GET_VALUE (fixP->fx_addsy) + *valP;
- riscv_pcrel_hi_fixup search = {location_pcrel_hi, 0, 0};
+ riscv_pcrel_hi_fixup search = {seg, location_pcrel_hi, 0, 0};
riscv_pcrel_hi_fixup *entry = htab_find (riscv_pcrel_hi_fixup_hash,
&search);
if (entry && entry->symbol