[2/2] libelf: Make sure ar_size starts with a digit before calling atol.

Message ID 20220317133051.100876-3-mark@klomp.org
State Committed
Headers
Series [1/2] libelf: Take map offset into account for Shdr alignment check in elf_begin |

Commit Message

Mark Wielaard March 17, 2022, 1:30 p.m. UTC
  The ar_size field is a 10 character string, not zero terminated, of
decimal digits right padded with spaces.  Make sure it actually starts
with a digit before calling atol on it.  We already make sure it is
zero terminated. Otherwise atol might produce unexpected results.

Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 libelf/ChangeLog   | 4 ++++
 libelf/elf_begin.c | 5 +++++
 2 files changed, 9 insertions(+)
  

Comments

Evgeny Vereshchagin March 18, 2022, 9:11 a.m. UTC | #1
Hi,

> The ar_size field is a 10 character string, not zero terminated, of
> decimal digits right padded with spaces.  Make sure it actually starts
> with a digit before calling atol on it.  We already make sure it is
> zero terminated. Otherwise atol might produce unexpected results.

As far as I can tell the patch fixes that particular issue. Thanks! 

On a somewhat related note, looking at https://sourceware.org/bugzilla/show_bug.cgi?id=24085
where read_long_names started appending a trailing '\0' to strings without trailing spaces only I wonder if it would be better
to always append trailing zero bytes there? It would make ASan stop complaining
about read_long_names with ASAN_OPTIONS=strict_string_checks=1 (which is supposed to look for places where strings without trailing zeroes are passed
to functions expecting null-terminated strings). For example even with this patch applied run-arextract.sh seems to fail under
ASAN with strict_string_checks=1:
```
ASAN_OPTIONS=strict_string_checks=1 make check TESTS=run-arextract.sh VERBOSE=1
...
==126042==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd5d103c7c at pc 0x7fe87890156d bp 0x7ffd5d103ab0 sp 0x7ffd5d103260
READ of size 15 at 0x7ffd5d103c7c thread T0
    #0 0x7fe87890156c in StrtolFixAndCheck(void*, char const*, char**, char*, int) (/lib64/libasan.so.6+0x6b56c)
    #1 0x7fe878901865 in __interceptor_strtol (/lib64/libasan.so.6+0x6b865)
    #2 0x7fe87885ecf9 in atol /usr/include/stdlib.h:368
    #3 0x7fe87885ecf9 in read_long_names /home/vagrant/oss-fuzz/projects/elfutils/elfutils/libelf/elf_begin.c:773
    #4 0x7fe87885ecf9 in __libelf_next_arhdr_wrlock /home/vagrant/oss-fuzz/projects/elfutils/elfutils/libelf/elf_begin.c:919
    #5 0x7fe87885fb25 in elf_next /home/vagrant/oss-fuzz/projects/elfutils/elfutils/libelf/elf_next.c:63
    #6 0x401360 in main /home/vagrant/oss-fuzz/projects/elfutils/elfutils/tests/arextract.c:146
    #7 0x7fe87867555f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)
    #8 0x7fe87867560b in __libc_start_main_alias_1 (/lib64/libc.so.6+0x2d60b)
    #9 0x401654 in _start (/home/vagrant/oss-fuzz/projects/elfutils/elfutils/tests/arextract+0x401654)

Address 0x7ffd5d103c7c is located in stack of thread T0 at offset 284 in frame
    #0 0x7fe87885dbbf in __libelf_next_arhdr_wrlock /home/vagrant/oss-fuzz/projects/elfutils/elfutils/libelf/elf_begin.c:852
```
(In this particular case it's a false positive because in practice it's safe to pass strings like that there though)

Thanks,
Evgeny Vereshchagin
  
Mark Wielaard March 18, 2022, 11:44 a.m. UTC | #2
Hi Evgeny,

On Fri, Mar 18, 2022 at 12:11:50PM +0300, Evgeny Vereshchagin wrote:
> > The ar_size field is a 10 character string, not zero terminated, of
> > decimal digits right padded with spaces.  Make sure it actually starts
> > with a digit before calling atol on it.  We already make sure it is
> > zero terminated. Otherwise atol might produce unexpected results.
> 
> As far as I can tell the patch fixes that particular issue. Thanks! 

Thanks for testing.

> On a somewhat related note, looking at
> https://sourceware.org/bugzilla/show_bug.cgi?id=24085 where
> read_long_names started appending a trailing '\0' to strings without
> trailing spaces only I wonder if it would be better to always append
> trailing zero bytes there? It would make ASan stop complaining about
> read_long_names with ASAN_OPTIONS=strict_string_checks=1 (which is
> supposed to look for places where strings without trailing zeroes
> are passed to functions expecting null-terminated strings).

I guess the idea is that there could be an atoi implementation that
starts from the end of the string? But I think that is super unlikely
since atoi (and strtol) is defined on the initial portion of the
character array. The algorithm is described as working from the start
and once a valid digit is found any non-digit terminates the
algorithm, there seems to be no requirement that that char should be a
zero terminator. So I think that asan strict-string check is not
really correct.

Also since the ar_size is defined as a character array that only
contains digits and (right padded) spaces (but no zero terminator), we
would have to copy the chars always if we would add a zero
terminator. Which is very unlikely (except when the size is larger
than 999999999 bytes, 953 MB.

Cheers,

Mark
  
Evgeny Vereshchagin March 18, 2022, 1:18 p.m. UTC | #3
Hi Mark,

> I guess the idea is that there could be an atoi implementation that
> starts from the end of the string? But I think that is super unlikely
> since atoi (and strtol) is defined on the initial portion of the
> character array. The algorithm is described as working from the start
> and once a valid digit is found any non-digit terminates the
> algorithm, there seems to be no requirement that that char should be a
> zero terminator. So I think that asan strict-string check is not
> really correct.

The idea behind strict_string_checks is to just warn about functions expecting
null-terminated strings that process (potentially) binary data and can in theory get past the end
of the buffers because of that. It just looks for nulls and if they aren't there it complains.
It's off by default because it tends to produce false positives. But I think it's useful sometimes because
for example as far as I can remember it was able to find real heap-buffer-overflows in systemd at some point
and it has been on on the CI there since "string" functions were replaced with functions receiving buffers
and their lengths. Then again, I agree it doesn't seem to make much sense to make ASan happy here.

Thanks,
Evgeny Vereshchagin
  

Patch

diff --git a/libelf/ChangeLog b/libelf/ChangeLog
index 1883af07..07dd905f 100644
--- a/libelf/ChangeLog
+++ b/libelf/ChangeLog
@@ -1,3 +1,7 @@ 
+2022-03-17  Mark Wielaard  <mark@klomp.org>
+
+	* elf_begin.c (read_long_names): Check ar_size starts with a digit.
+
 2022-03-17  Mark Wielaard  <mark@klomp.org>
 
 	* elf_begin.c (get_shnum): Take offset into account for Shdr
diff --git a/libelf/elf_begin.c b/libelf/elf_begin.c
index 03b80185..917e0c71 100644
--- a/libelf/elf_begin.c
+++ b/libelf/elf_begin.c
@@ -765,6 +765,11 @@  read_long_names (Elf *elf)
 	  *((char *) mempcpy (buf, hdr->ar_size, sizeof (hdr->ar_size))) = '\0';
 	  string = buf;
 	}
+
+      /* atol expects to see at least one digit.
+	 It also cannot be negative (-).  */
+      if (!isdigit(string[0]))
+	return NULL;
       len = atol (string);
 
       if (memcmp (hdr->ar_name, "//              ", 16) == 0)