Fix writing object contents above 4GB in tekhex format
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_binutils_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_binutils_check--master-arm |
success
|
Test passed
|
Commit Message
Writing tekhex object with addresses above 4GB resulted in a corrupted object
contents, because the upper 32-bit of the address were silently ignored
in `writevalue`.
While reading a 64-bit address section did work in tekhex format, with the
address length encoded as 0.
This code adds the neccesary support for writing the upper bits of the
address, while not changing the existing behavior for addresses below
4GB.
Signed-off-by: Dani Szebenyi <szedani@linux.ibm.com>
---
bfd/tekhex.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Comments
Hello Dani!
On 23.10.2024 17:10, Dani Szebenyi wrote:
> Writing tekhex object with addresses above 4GB resulted in a corrupted object
> contents, because the upper 32-bit of the address were silently ignored
> in `writevalue`.
> While reading a 64-bit address section did work in tekhex format, with the
> address length encoded as 0.
The extended Tektronix hex format encodes an address as one (hex?)digit
address-field-length N followed by the N hexadecimal digits address.
Both getvalue() and getsym() treat the address length as hexadecimal and
a value of 0 as if it was decimal 16 when reading tekhex:
len = hex_value (*src++);
if (len == 0)
len = 16;
> This code adds the neccesary support for writing the upper bits of the
> address, while not changing the existing behavior for addresses below
> 4GB.
Therefore I agree with you that it would be fair to also do so in
writevalue() when writing tekhex.
Btw. the following likely unofficial documentation agrees with this
encoding (using "0" for 16):
http://ftp.dataio.com/main/manuals/unifam/translation%20formats.pdf
> Signed-off-by: Dani Szebenyi <szedani@linux.ibm.com>
> ---
> bfd/tekhex.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/bfd/tekhex.c b/bfd/tekhex.c
> index 96022c585ab..46f40b19817 100644
> --- a/bfd/tekhex.c
> +++ b/bfd/tekhex.c
> @@ -719,11 +719,11 @@ writevalue (char **dst, bfd_vma value)
> int len;
> int shift;
>
> - for (len = 8, shift = 28; shift; shift -= 4, len--)
> + for (len = 16, shift = 60; shift; shift -= 4, len--)
> {
> if ((value >> shift) & 0xf)
> {
> - *p++ = len + '0';
> + *p++ = digs[len & 0xf];
> while (len)
> {
> *p++ = digs[(value >> shift) & 0xf];
Looks good to me! Hope Nick/Alan agree.
With this patch I do wonder whether gdb/testsuite/gdb.base/dump.exp
could also be changed to enable test reload of tekhex on 64-bit
architectures?
Thanks and regards,
Jens
On Wed, Oct 23, 2024 at 06:04:34PM +0200, Jens Remus wrote:
> Hello Dani!
>
> On 23.10.2024 17:10, Dani Szebenyi wrote:
> > Writing tekhex object with addresses above 4GB resulted in a corrupted object
> > contents, because the upper 32-bit of the address were silently ignored
> > in `writevalue`.
> > While reading a 64-bit address section did work in tekhex format, with the
> > address length encoded as 0.
>
> The extended Tektronix hex format encodes an address as one (hex?)digit
> address-field-length N followed by the N hexadecimal digits address.
>
> Both getvalue() and getsym() treat the address length as hexadecimal and a
> value of 0 as if it was decimal 16 when reading tekhex:
>
> len = hex_value (*src++);
> if (len == 0)
> len = 16;
>
> > This code adds the neccesary support for writing the upper bits of the
> > address, while not changing the existing behavior for addresses below
> > 4GB.
>
> Therefore I agree with you that it would be fair to also do so in
> writevalue() when writing tekhex.
>
> Btw. the following likely unofficial documentation agrees with this encoding
> (using "0" for 16):
> http://ftp.dataio.com/main/manuals/unifam/translation%20formats.pdf
>
> > Signed-off-by: Dani Szebenyi <szedani@linux.ibm.com>
> > ---
> > bfd/tekhex.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/bfd/tekhex.c b/bfd/tekhex.c
> > index 96022c585ab..46f40b19817 100644
> > --- a/bfd/tekhex.c
> > +++ b/bfd/tekhex.c
> > @@ -719,11 +719,11 @@ writevalue (char **dst, bfd_vma value)
> > int len;
> > int shift;
> > - for (len = 8, shift = 28; shift; shift -= 4, len--)
> > + for (len = 16, shift = 60; shift; shift -= 4, len--)
> > {
> > if ((value >> shift) & 0xf)
> > {
> > - *p++ = len + '0';
> > + *p++ = digs[len & 0xf];
> > while (len)
> > {
> > *p++ = digs[(value >> shift) & 0xf];
>
> Looks good to me! Hope Nick/Alan agree.
Um, no. You'll get undefined behaviour when bfd_vma is 32 bits.
Huh, and it looks like the function is broken in other ways. If, for
example, "value" is 3 you'll get the same output as "value" 0.
So I rewrote it and decided I ought to test it..
echo -n "123" > bin
binutils/objcopy -I binary bin -O tekhex xxx
binutils/objdump -s xxx
binutils/objdump: xxx: file format not recognized
Oh great, let's see what is going on. After debugging a little,
I see we have a problem with *ABS* symbols, and after fixing that,
the tekhex file created above:
%47621103132330000000000000000000000000000000000000000000000000000000000
%103EB5.data11013
%1F3015.data40_binary_bin_star10
%1E3C85.data4F_binary_bin_end13
%1F3585*ABS*20_binary_bin_size13
%0781010
now can be dumped.
binutils/objdump -sx xxx
xxx: file format tekhex
xxx
architecture: UNKNOWN!, flags 0x00000010:
HAS_SYMS
start address 0x00000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .data 00000003 00000000 00000000 00000000 2**0
CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
00000000 g .data _binary_bin_star
00000003 g .data _binary_bin_end
00000003 g *ABS* _binary_bin_size
Contents of section .data:
0000 313233 123
* tekhex.c (first_phase): Handle *ABS* symbols.
(writevalue): Rewrite.
diff --git a/bfd/tekhex.c b/bfd/tekhex.c
index 96022c585ab..16e8e185572 100644
--- a/bfd/tekhex.c
+++ b/bfd/tekhex.c
@@ -395,7 +395,7 @@ first_phase (bfd *abfd, int type, char *src, char * src_end)
if (!n)
return false;
memcpy (n, sym, len + 1);
- section = bfd_make_section (abfd, n);
+ section = bfd_make_section_old_way (abfd, n);
if (section == NULL)
return false;
}
@@ -719,24 +719,13 @@ writevalue (char **dst, bfd_vma value)
int len;
int shift;
- for (len = 8, shift = 28; shift; shift -= 4, len--)
- {
- if ((value >> shift) & 0xf)
- {
- *p++ = len + '0';
- while (len)
- {
- *p++ = digs[(value >> shift) & 0xf];
- shift -= 4;
- len--;
- }
- *dst = p;
- return;
+ for (len = BFD_ARCH_SIZE / 4, shift = len * 4 - 4; len > 1; shift -= 4, len--)
+ if ((value >> shift) & 0xf)
+ break;
- }
- }
- *p++ = '1';
- *p++ = '0';
+ *p++ = digs[len & 0xf];
+ for (; len; shift -= 4, len--)
+ *p++ = digs[(value >> shift) & 0xf];
*dst = p;
}
Hello Alan!
On 24.10.2024 09:33, Alan Modra wrote:
> On Wed, Oct 23, 2024 at 06:04:34PM +0200, Jens Remus wrote:
>> On 23.10.2024 17:10, Dani Szebenyi wrote:
...>>> bfd/tekhex.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/bfd/tekhex.c b/bfd/tekhex.c
>>> index 96022c585ab..46f40b19817 100644
>>> --- a/bfd/tekhex.c
>>> +++ b/bfd/tekhex.c
>>> @@ -719,11 +719,11 @@ writevalue (char **dst, bfd_vma value)
>>> int len;
>>> int shift;
>>> - for (len = 8, shift = 28; shift; shift -= 4, len--)
>>> + for (len = 16, shift = 60; shift; shift -= 4, len--)
>>> {
>>> if ((value >> shift) & 0xf)
>>> {
>>> - *p++ = len + '0';
>>> + *p++ = digs[len & 0xf];
>>> while (len)
>>> {
>>> *p++ = digs[(value >> shift) & 0xf];
>>
>> Looks good to me! Hope Nick/Alan agree.
>
> Um, no. You'll get undefined behaviour when bfd_vma is 32 bits.
Aww, I did not think of that.
> Huh, and it looks like the function is broken in other ways. If, for
> example, "value" is 3 you'll get the same output as "value" 0.
Good catch!
> So I rewrote it and decided I ought to test it..
>
> echo -n "123" > bin
> binutils/objcopy -I binary bin -O tekhex xxx
> binutils/objdump -s xxx
> binutils/objdump: xxx: file format not recognized
Dani did not notice when testing his use use case in GDB (dump tekhex ...).
> Oh great, let's see what is going on. After debugging a little,
> I see we have a problem with *ABS* symbols, and after fixing that,
> the tekhex file created above:
> %47621103132330000000000000000000000000000000000000000000000000000000000
> %103EB5.data11013
> %1F3015.data40_binary_bin_star10
> %1E3C85.data4F_binary_bin_end13
> %1F3585*ABS*20_binary_bin_size13
> %0781010
>
> now can be dumped.
> binutils/objdump -sx xxx
>
> xxx: file format tekhex
> xxx
> architecture: UNKNOWN!, flags 0x00000010:
> HAS_SYMS
> start address 0x00000000
>
> Sections:
> Idx Name Size VMA LMA File off Algn
> 0 .data 00000003 00000000 00000000 00000000 2**0
> CONTENTS, ALLOC, LOAD, DATA
> SYMBOL TABLE:
> 00000000 g .data _binary_bin_star
> 00000003 g .data _binary_bin_end
> 00000003 g *ABS* _binary_bin_size
>
>
> Contents of section .data:
> 0000 313233 123
Awesome, thank you!
> * tekhex.c (first_phase): Handle *ABS* symbols.
> (writevalue): Rewrite.
>
> diff --git a/bfd/tekhex.c b/bfd/tekhex.c
> index 96022c585ab..16e8e185572 100644
> --- a/bfd/tekhex.c
> +++ b/bfd/tekhex.c
> @@ -395,7 +395,7 @@ first_phase (bfd *abfd, int type, char *src, char * src_end)
> if (!n)
> return false;
> memcpy (n, sym, len + 1);
> - section = bfd_make_section (abfd, n);
> + section = bfd_make_section_old_way (abfd, n);
> if (section == NULL)
> return false;
> }
> @@ -719,24 +719,13 @@ writevalue (char **dst, bfd_vma value)
> int len;
> int shift;
>
> - for (len = 8, shift = 28; shift; shift -= 4, len--)
> - {
> - if ((value >> shift) & 0xf)
> - {
> - *p++ = len + '0';
> - while (len)
> - {
> - *p++ = digs[(value >> shift) & 0xf];
> - shift -= 4;
> - len--;
> - }
> - *dst = p;
> - return;
> + for (len = BFD_ARCH_SIZE / 4, shift = len * 4 - 4; len > 1; shift -= 4, len--)
> + if ((value >> shift) & 0xf)
> + break;
>
> - }
> - }
> - *p++ = '1';
> - *p++ = '0';
> + *p++ = digs[len & 0xf];
> + for (; len; shift -= 4, len--)
> + *p++ = digs[(value >> shift) & 0xf];
> *dst = p;
> }
Looks good to me. I see you already commited your patch. Thank you for
taking care!
Regards,
Jens
@@ -719,11 +719,11 @@ writevalue (char **dst, bfd_vma value)
int len;
int shift;
- for (len = 8, shift = 28; shift; shift -= 4, len--)
+ for (len = 16, shift = 60; shift; shift -= 4, len--)
{
if ((value >> shift) & 0xf)
{
- *p++ = len + '0';
+ *p++ = digs[len & 0xf];
while (len)
{
*p++ = digs[(value >> shift) & 0xf];