Fix computations with (potentially) NULL pointer

Message ID 20231113225835.4083255-2-ppluzhnikov@google.com
State Committed
Headers
Series Fix computations with (potentially) NULL pointer |

Commit Message

Paul Pluzhnikov Nov. 13, 2023, 10:58 p.m. UTC
  When map_address is NULL, computing map_address+offset is technically
undefined behavior, and triggers Clang/LLVM warning when using
-fsanitize=pointer-overflow.

Fix this by using uintptr_t to perform computations.

Signed-off-by: Shahriar "Nafi" Rouf <nafi@google.com>
---
 libelf/elf_begin.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
  

Comments

Mark Wielaard Nov. 14, 2023, 12:57 p.m. UTC | #1
Hi Paul, Hi Nafi,

On Mon, 2023-11-13 at 22:58 +0000, Paul Pluzhnikov wrote:
> When map_address is NULL, computing map_address+offset is technically
> undefined behavior, and triggers Clang/LLVM warning when using
> -fsanitize=pointer-overflow.

Urgh, I had no idea NULL + ... was technically undefined behavior.

> Fix this by using uintptr_t to perform computations.

I can see how this solves the the issue. It would be slightly nicer if
we could just do the computation after checking map_address != NULL
(since ehdr is only used after such a check). That would require
rearranging some of the if statements. Does that make the code too
complicated?

Also this only resolves the issue for the 64bit ELF case. Just above
this code is basically the same code for 32bit ELF. That code also
needs to be fixed.

Thanks,

Mark
  
Paul Pluzhnikov Nov. 14, 2023, 4:12 p.m. UTC | #2
Mark,

On Tue, Nov 14, 2023 at 4:57 AM Mark Wielaard <mark@klomp.org> wrote:

> Urgh, I had no idea NULL + ... was technically undefined behavior.

ISO/IEC 9899:201x
6.5.6p8

When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integer expression.
...  If both the pointer operand and the result point to elements of
the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.

> It would be slightly nicer if
> we could just do the computation after checking map_address != NULL
> (since ehdr is only used after such a check). That would require
> rearranging some of the if statements. Does that make the code too
> complicated?

I tried it, and it does: we need both "map_addr != 0" and "ehdr is
properly aligned", but we can't compute the latter before evaluating
the former, and we have the else clause when either condition fails. I
can fix this with a goto, or a helper variable, but the current
solution of keeping ehdr as uintptr_t is much simpler.

It also reduces the casting and line wrapping required :-)

> Also this only resolves the issue for the 64bit ELF case. Just above
> this code is basically the same code for 32bit ELF. That code also
> needs to be fixed.

Sorry I missed that.

Revised patch attached.

Thanks,
  
Mark Wielaard Nov. 14, 2023, 4:57 p.m. UTC | #3
Hi Paul,

On Tue, 2023-11-14 at 08:12 -0800, Paul Pluzhnikov wrote:
> On Tue, Nov 14, 2023 at 4:57 AM Mark Wielaard <mark@klomp.org> wrote:
> 
> > Urgh, I had no idea NULL + ... was technically undefined behavior.
> 
> ISO/IEC 9899:201x
> 6.5.6p8
> 
> When an expression that has integer type is added to or subtracted
> from a pointer, the result has the type of the pointer operand. If the
> pointer operand points to an element of an array object, and the array
> is large enough, the result points to an element offset from the
> original element such that the difference of the subscripts of the
> resulting and original array elements equals the integer expression.
> ...  If both the pointer operand and the result point to elements of
> the same array object, or one past the last element of the array
> object, the evaluation shall not produce an overflow; otherwise, the
> behavior is undefined.

Aha, so since NULL isn't a pointer to an array object a compiler may
assume the pointer isn't NULL and possibly remove any NULL checks later
in the code if it sees you do pointer arithmetic... boo.

> > It would be slightly nicer if
> > we could just do the computation after checking map_address != NULL
> > (since ehdr is only used after such a check). That would require
> > rearranging some of the if statements. Does that make the code too
> > complicated?
> 
> I tried it, and it does: we need both "map_addr != 0" and "ehdr is
> properly aligned", but we can't compute the latter before evaluating
> the former, and we have the else clause when either condition fails. I
> can fix this with a goto, or a helper variable, but the current
> solution of keeping ehdr as uintptr_t is much simpler.
> 
> It also reduces the casting and line wrapping required :-)

Yeah, I was afraid of that. Thanks for looking into it though.

> > Also this only resolves the issue for the 64bit ELF case. Just above
> > this code is basically the same code for 32bit ELF. That code also
> > needs to be fixed.
> 
> Sorry I missed that.

I am slightly surprised our testsuite didn't catch this. We do have --
enable-sanitize-undefined which does build everything with --
sanitize=undefined. Which should enable -fsanitize=pointer-overflow.
But I just tried (with gcc) and it seems to not trigger.

> Revised patch attached.

Looks good. Applied.

Thanks,

Mark
  
Paul Pluzhnikov Nov. 14, 2023, 5:04 p.m. UTC | #4
Mark,

On Tue, Nov 14, 2023 at 8:57 AM Mark Wielaard <mark@klomp.org> wrote:

> I am slightly surprised our testsuite didn't catch this. We do have --
> enable-sanitize-undefined which does build everything with --
> sanitize=undefined. Which should enable -fsanitize=pointer-overflow.
> But I just tried (with gcc) and it seems to not trigger.

This was exposed by Clang (and a close to HEAD Clang at that -- I am
not sure whether released Clang also catches this).

> Looks good. Applied.

Thanks!

Appreciate the speedy reviews.
  
Mark Wielaard Nov. 14, 2023, 5:55 p.m. UTC | #5
Hi Paul,

On Tue, 2023-11-14 at 09:04 -0800, Paul Pluzhnikov wrote:
> On Tue, Nov 14, 2023 at 8:57 AM Mark Wielaard <mark@klomp.org> wrote:
> > Looks good. Applied.
> 
> Thanks!
> 
> Appreciate the speedy reviews.

Unfortunately our 32bit buildbots were also very quick to point out an
issue: https://builder.sourceware.org/buildbot/#/changes/35202
elfutils-debian-i386 and elfutils-debian-armhf fail to compile the same
way:

elf_begin.c: In function ‘file_read_elf’:
elf_begin.c:495:30: error: cast to pointer from integer of different
size [-Werror=int-to-pointer-cast]
      elf->state.elf64.shdr = (Elf64_Shdr *) (ehdr + e_shoff);
                              ^
cc1: all warnings being treated as errors

This is I think because e_shoff is an Elf64_Off, so (ehdr + e_shoff)
gets promoted to 64bits...

Which does expose an interesting issue that (theoretically) mmaped
64bit Elf files cannot be used on 32bit architectures... hohum.

I don't immediately know what the correct fix is here. I'll think about
it over dinner.

Cheers,

Mark
  
Paul Pluzhnikov Nov. 14, 2023, 6:56 p.m. UTC | #6
On Tue, Nov 14, 2023 at 9:55 AM Mark Wielaard <mark@klomp.org> wrote:

> Unfortunately our 32bit buildbots were also very quick to point out an
> issue: https://builder.sourceware.org/buildbot/#/changes/35202

Sorry about the break.

I just tried "./configure "CC=gcc -m32" "CXX=g++ -m32" and that didn't
reproduce the failure.

> Which does expose an interesting issue that (theoretically) mmaped
> 64bit Elf files cannot be used on 32bit architectures... hohum.

The failure here would be when map_addr ends up high in memory, and
e_shoff is so large that it causes a wrap around.

Section headers do tend to be near the end of the ELF. One of our
large 64-bit binaries (3.6GiB in size) has e_shoff == 3803727624, so
the overflow seems very likely here ... except the mmap would have
failed already, because the mmap covers the entire file. So I think
the overflow can not happen in practice.

If that's true, we can cast e_shoff to ptrdiff_t to suppress the warning.

diff --git a/libelf/elf_begin.c b/libelf/elf_begin.c
index 9f8196b6..dcaad8ee 100644
--- a/libelf/elf_begin.c
+++ b/libelf/elf_begin.c
@@ -492,7 +492,7 @@ file_read_elf (int fildes, void *map_address,
unsigned char *e_ident,
            goto free_and_out;

          if (scncnt > 0)
-           elf->state.elf64.shdr = (Elf64_Shdr *) (ehdr + e_shoff);
+           elf->state.elf64.shdr = (Elf64_Shdr *) (ehdr + (ptrdiff_t) e_shoff);

          for (size_t cnt = 0; cnt < scncnt; ++cnt)
            {
  
Mark Wielaard Nov. 14, 2023, 8:52 p.m. UTC | #7
Hi Paul,

On Tue, Nov 14, 2023 at 10:56:50AM -0800, Paul Pluzhnikov wrote:
> On Tue, Nov 14, 2023 at 9:55 AM Mark Wielaard <mark@klomp.org> wrote:
> 
> > Unfortunately our 32bit buildbots were also very quick to point out an
> > issue: https://builder.sourceware.org/buildbot/#/changes/35202
> 
> Sorry about the break.

No worries. That is what the bots are for :)

> I just tried "./configure "CC=gcc -m32" "CXX=g++ -m32" and that didn't
> reproduce the failure.

Are you sure? It does for me. It also confirmed your suggestion below
fixes it.

> > Which does expose an interesting issue that (theoretically) mmaped
> > 64bit Elf files cannot be used on 32bit architectures... hohum.
> 
> The failure here would be when map_addr ends up high in memory, and
> e_shoff is so large that it causes a wrap around.
> 
> Section headers do tend to be near the end of the ELF. One of our
> large 64-bit binaries (3.6GiB in size) has e_shoff == 3803727624, so
> the overflow seems very likely here ... except the mmap would have
> failed already, because the mmap covers the entire file. So I think
> the overflow can not happen in practice.

O, good observation. You are right.

> If that's true, we can cast e_shoff to ptrdiff_t to suppress the warning.

Thanks. I pushed that fix as attached.

Cheers,

Mark
From f84f1cd7e296bf223cb45b5469978d4bea82cec0 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mark@klomp.org>
Date: Tue, 14 Nov 2023 21:34:50 +0100
Subject: [PATCH] libelf: Fix elf_begin.c build on 32bit arches.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 32bit architectures gcc produces an error:

elf_begin.c: In function ‘file_read_elf’:
elf_begin.c:495:30: error: cast to pointer from integer of different
size [-Werror=int-to-pointer-cast]
      elf->state.elf64.shdr = (Elf64_Shdr *) (ehdr + e_shoff);
                              ^

This is because we are adding an uintptr to an Elf64_Off which
promotes the result to a 64bit value. Fix this by casting the
e_shoff to an ptrdiff_t. This is fine since the mmap of the file
would have failed if it didn't fit in the 32bit address space
and we check that e_shoff fits inside the image.

	* libelf/elf_begin.c (file_read_elf): Cast e_shoff to ptrdiff_t
	before adding to ehdr.

Suggested-by: Paul Pluzhnikov <ppluzhnikov@google.com>
Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 libelf/elf_begin.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libelf/elf_begin.c b/libelf/elf_begin.c
index 9f8196b6..dcaad8ee 100644
--- a/libelf/elf_begin.c
+++ b/libelf/elf_begin.c
@@ -492,7 +492,7 @@ file_read_elf (int fildes, void *map_address, unsigned char *e_ident,
 	    goto free_and_out;
 
 	  if (scncnt > 0)
-	    elf->state.elf64.shdr = (Elf64_Shdr *) (ehdr + e_shoff);
+	    elf->state.elf64.shdr = (Elf64_Shdr *) (ehdr + (ptrdiff_t) e_shoff);
 
 	  for (size_t cnt = 0; cnt < scncnt; ++cnt)
 	    {
  

Patch

diff --git a/libelf/elf_begin.c b/libelf/elf_begin.c
index fe8c640a..da495bef 100644
--- a/libelf/elf_begin.c
+++ b/libelf/elf_begin.c
@@ -445,15 +445,15 @@  file_read_elf (int fildes, void *map_address, unsigned char *e_ident,
     {
       /* This pointer might not be directly usable if the alignment is
 	 not sufficient for the architecture.  */
-      Elf64_Ehdr *ehdr = (Elf64_Ehdr *) ((char *) map_address + offset);
+      uintptr_t ehdr = (uintptr_t) map_address + offset;
 
       /* This is a 64-bit binary.  */
       if (map_address != NULL && e_ident[EI_DATA] == MY_ELFDATA
 	  && (ALLOW_UNALIGNED
-	      || (((uintptr_t) ehdr) & (__alignof__ (Elf64_Ehdr) - 1)) == 0))
+	      || (ehdr & (__alignof__ (Elf64_Ehdr) - 1)) == 0))
 	{
 	  /* We can use the mmapped memory.  */
-	  elf->state.elf64.ehdr = ehdr;
+	  elf->state.elf64.ehdr = (Elf64_Ehdr *) ehdr;
 	}
       else
 	{
@@ -486,7 +486,7 @@  file_read_elf (int fildes, void *map_address, unsigned char *e_ident,
       if (map_address != NULL && e_ident[EI_DATA] == MY_ELFDATA
 	  && cmd != ELF_C_READ_MMAP /* We need a copy to be able to write.  */
 	  && (ALLOW_UNALIGNED
-	      || ((((uintptr_t) ehdr + e_shoff)
+	      || (((ehdr + e_shoff)
 		   & (__alignof__ (Elf64_Shdr) - 1)) == 0)))
 	{
 	  if (unlikely (scncnt > 0 && e_shoff >= maxsize)
@@ -496,7 +496,7 @@  file_read_elf (int fildes, void *map_address, unsigned char *e_ident,
 
 	  if (scncnt > 0)
 	    elf->state.elf64.shdr
-	      = (Elf64_Shdr *) ((char *) ehdr + e_shoff);
+	      = (Elf64_Shdr *) (ehdr + e_shoff);
 
 	  for (size_t cnt = 0; cnt < scncnt; ++cnt)
 	    {