linux: Fix integer overflow warnings when including <sys/mount.h> [BZ #32708]

Message ID 20250217055647.493638-1-collin.funk1@gmail.com (mailing list archive)
State Committed
Commit 3263675250cbcbbcc76ede4f7c660418bd345a11
Headers
Series linux: Fix integer overflow warnings when including <sys/mount.h> [BZ #32708] |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit success Build for i686
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Test passed

Commit Message

Collin Funk Feb. 17, 2025, 5:56 a.m. UTC
  Using gcc -Wshift-overflow=2 -Wsystem-headers to compile a file
including <sys/mount.h> will cause a warning since 1 << 31 is undefined
behavior on platforms where int is 32-bits.

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
---
 sysdeps/unix/sysv/linux/sys/mount.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Cristian Rodríguez Feb. 17, 2025, 12:52 p.m. UTC | #1
On Mon, Feb 17, 2025 at 2:57 AM Collin Funk <collin.funk1@gmail.com> wrote:
>
> Using gcc -Wshift-overflow=2 -Wsystem-headers to compile a file
> including <sys/mount.h> will cause a warning since 1 << 31 is undefined
> behavior on platforms where int is 32-bits.


huh. That's pretty bad.. The same problem exists on the kernel
headers. did you send a kernel patch too?

https://github.com/torvalds/linux/blob/0ad2507d5d93f39619fc42372c347d6006b64319/include/uapi/linux/mount.h#L47
  
Carlos O'Donell March 25, 2025, 4:56 p.m. UTC | #2
On 2/17/25 12:56 AM, Collin Funk wrote:
> Using gcc -Wshift-overflow=2 -Wsystem-headers to compile a file
> including <sys/mount.h> will cause a warning since 1 << 31 is undefined
> behavior on platforms where int is 32-bits.
> 
> Signed-off-by: Collin Funk <collin.funk1@gmail.com>

Tested on i686 and x86_64 with no regressions and this matches a similar termios.h
change we made.

LGTM. I'll push this shortly when my other regression testers complete.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> ---
>   sysdeps/unix/sysv/linux/sys/mount.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sysdeps/unix/sysv/linux/sys/mount.h b/sysdeps/unix/sysv/linux/sys/mount.h
> index 7c6d0805d7..b549e75148 100644
> --- a/sysdeps/unix/sysv/linux/sys/mount.h
> +++ b/sysdeps/unix/sysv/linux/sys/mount.h
> @@ -121,7 +121,7 @@ enum
>     MS_ACTIVE = 1 << 30,
>   #define MS_ACTIVE	MS_ACTIVE
>   #undef MS_NOUSER
> -  MS_NOUSER = 1 << 31
> +  MS_NOUSER = 1U << 31
>   #define MS_NOUSER	MS_NOUSER
>   };
>
  
Florian Weimer March 25, 2025, 6:12 p.m. UTC | #3
* Collin Funk:

> Using gcc -Wshift-overflow=2 -Wsystem-headers to compile a file
> including <sys/mount.h> will cause a warning since 1 << 31 is undefined
> behavior on platforms where int is 32-bits.
>
> Signed-off-by: Collin Funk <collin.funk1@gmail.com>
> ---
>  sysdeps/unix/sysv/linux/sys/mount.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sysdeps/unix/sysv/linux/sys/mount.h b/sysdeps/unix/sysv/linux/sys/mount.h
> index 7c6d0805d7..b549e75148 100644
> --- a/sysdeps/unix/sysv/linux/sys/mount.h
> +++ b/sysdeps/unix/sysv/linux/sys/mount.h
> @@ -121,7 +121,7 @@ enum
>    MS_ACTIVE = 1 << 30,
>  #define MS_ACTIVE	MS_ACTIVE
>  #undef MS_NOUSER
> -  MS_NOUSER = 1 << 31
> +  MS_NOUSER = 1U << 31
>  #define MS_NOUSER	MS_NOUSER
>  };

This change alters the type of all enum constants from int to unsigned
int.  Should we keep the type of the other constants?  Or use INT_MIN
(expanded) for MS_NOUSER?

Thanks,
Florian
  
Collin Funk March 25, 2025, 6:23 p.m. UTC | #4
Florian Weimer <fweimer@redhat.com> writes:

> This change alters the type of all enum constants from int to unsigned
> int.  Should we keep the type of the other constants?  Or use INT_MIN
> (expanded) for MS_NOUSER?

Since mount(2) expects an unsigned long flags, I feel like changing it
to unsigned should be fine. Since the flags are meant to be OR'd
together, I don't think anyone should be using operators like ==, which
might trigger -Wsign-compare or something like that.

Let me know if you disagree.

Collin
  
Carlos O'Donell March 25, 2025, 7:21 p.m. UTC | #5
On 3/25/25 2:23 PM, Collin Funk wrote:
> Florian Weimer <fweimer@redhat.com> writes:
> 
>> This change alters the type of all enum constants from int to unsigned
>> int.  Should we keep the type of the other constants?  Or use INT_MIN
>> (expanded) for MS_NOUSER?
> 
> Since mount(2) expects an unsigned long flags, I feel like changing it
> to unsigned should be fine. Since the flags are meant to be OR'd
> together, I don't think anyone should be using operators like ==, which
> might trigger -Wsign-compare or something like that.

Agreed.

When I reviewed the patch I considered what the impact of the change to
unsigned might have, but I couldn't come up with a real scenario where
it might matter (other than perhaps recording the type in DWARF data
and checking that). I thought about type promotion, and casting, and
other things, but in this context not much matters.

I also considered if the constant should have been 1UL, but it doesn't
make a difference really. In some static scanners and by MISRA C the
"essential type" of 1U is unsigned char and so coverity might complain
for a shift beyond 8. But we're not trying to comply with MISRA C or
coverity here (we'd have a lot to change if we did).
  
> Let me know if you disagree.

Do we have a concrete scenario where it matters?
  
Florian Weimer March 25, 2025, 7:29 p.m. UTC | #6
* Collin Funk:

> Florian Weimer <fweimer@redhat.com> writes:
>
>> This change alters the type of all enum constants from int to unsigned
>> int.  Should we keep the type of the other constants?  Or use INT_MIN
>> (expanded) for MS_NOUSER?
>
> Since mount(2) expects an unsigned long flags, I feel like changing it
> to unsigned should be fine. Since the flags are meant to be OR'd
> together, I don't think anyone should be using operators like ==, which
> might trigger -Wsign-compare or something like that.

I'm not sure if there are any good options here.

The type unsigned long is clearly nominal only, the kernel has to
disregard the upper 32 bits if they exist due to the current sign
extension.

I looked at previous examples, and we generally use 1U << 31 instead of
-2147483647 - 1.  This commit made all enum constants unsigned, too:

commit 4920765eb417431261367bf65f1a8a5ffb78baf7
Author: Ulrich Drepper <drepper@gmail.com>
Date:   Wed Dec 21 22:14:05 2011 -0500

    Define EPOLLONESHOT and EPOLLET using unsigned values

So there is some precedent for what you are doing here.

I believe unsigned enum types are still a GNU extension, but maybe
<sys/mount.h> isn't expected to be processed with compilers without such
extensions.

Thanks,
Florian
  
Collin Funk March 25, 2025, 7:42 p.m. UTC | #7
Florian Weimer <fweimer@redhat.com> writes:

> I believe unsigned enum types are still a GNU extension, but maybe
> <sys/mount.h> isn't expected to be processed with compilers without such
> extensions.

I don't think that unsigned enums are a GNU extension. In C99 section
6.7.2.2 [1]:

    Each enumerated type shall be compatible with char, a signed integer
    type, or an unsigned integer type. The choice of type is
    implementation-defined, 110) but shall be capable of representing the
    values of all the members of the enumeration. The enumerated type is
    incomplete until after the } that terminates the list of enumerator
    declarations.

And subscript 110:

    An implementation may delay the choice of which integer type until
    all enumeration constants have been seen.

This definition taken from Gnulib is a good demonstration of this:

    typedef enum
    {
      _Bool_must_promote_to_int = -1,
      false = 0,
      true = 1
    } _Bool;

Without _Bool_must_promote_to_int the compiler is free to chose
'unsigned int' as the type for the enum, which is unwanted in this case.

Collin

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
  
Carlos O'Donell March 25, 2025, 7:49 p.m. UTC | #8
On 3/25/25 3:29 PM, Florian Weimer wrote:
> * Collin Funk:
> 
>> Florian Weimer <fweimer@redhat.com> writes:
>>
>>> This change alters the type of all enum constants from int to unsigned
>>> int.  Should we keep the type of the other constants?  Or use INT_MIN
>>> (expanded) for MS_NOUSER?
>>
>> Since mount(2) expects an unsigned long flags, I feel like changing it
>> to unsigned should be fine. Since the flags are meant to be OR'd
>> together, I don't think anyone should be using operators like ==, which
>> might trigger -Wsign-compare or something like that.
> 
> I'm not sure if there are any good options here.
> 
> The type unsigned long is clearly nominal only, the kernel has to
> disregard the upper 32 bits if they exist due to the current sign
> extension.
> 
> I looked at previous examples, and we generally use 1U << 31 instead of
> -2147483647 - 1.  This commit made all enum constants unsigned, too:
> 
> commit 4920765eb417431261367bf65f1a8a5ffb78baf7
> Author: Ulrich Drepper <drepper@gmail.com>
> Date:   Wed Dec 21 22:14:05 2011 -0500
> 
>      Define EPOLLONESHOT and EPOLLET using unsigned values
> 
> So there is some precedent for what you are doing here.

Correct.
  
> I believe unsigned enum types are still a GNU extension, but maybe
> <sys/mount.h> isn't expected to be processed with compilers without such
> extensions.

Before C23 they are.

As of C23 they are not (which allows underlying type specification).

Example:
test.c:5:14: error: ISO C restricts enumerator values to range of ‘int’ before C23 [-Wpedantic]
     5 |    val3    = 0xFFFFFFF0
       |              ^~~~~~~~~~
  
Florian Weimer March 25, 2025, 7:54 p.m. UTC | #9
* Collin Funk:

> Florian Weimer <fweimer@redhat.com> writes:
>
>> I believe unsigned enum types are still a GNU extension, but maybe
>> <sys/mount.h> isn't expected to be processed with compilers without such
>> extensions.
>
> I don't think that unsigned enums are a GNU extension. In C99 section
> 6.7.2.2 [1]:
>
>     Each enumerated type shall be compatible with char, a signed integer
>     type, or an unsigned integer type. The choice of type is
>     implementation-defined, 110) but shall be capable of representing the
>     values of all the members of the enumeration. The enumerated type is
>     incomplete until after the } that terminates the list of enumerator
>     declarations.

That's about the type.  About the initializers, it says a bit earlier:

| The expression that defines the value of an enumeration constant shall
| be an integer constant expression that has a value representable as an
| int.

For glibc targets, 1U << 31 is not representable as an int.

Thanks,
Florian
  
Carlos O'Donell March 25, 2025, 8:09 p.m. UTC | #10
On 3/25/25 3:42 PM, Collin Funk wrote:
> Florian Weimer <fweimer@redhat.com> writes:
> 
>> I believe unsigned enum types are still a GNU extension, but maybe
>> <sys/mount.h> isn't expected to be processed with compilers without such
>> extensions.
> 
> I don't think that unsigned enums are a GNU extension. In C99 section
> 6.7.2.2 [1]:

I think the correct question to ask is if the value is representable
without an extension.

You quote 6.7.2.2 # 4

>      Each enumerated type shall be compatible with char, a signed integer
>      type, or an unsigned integer type. The choice of type is
>      implementation-defined, 110) but shall be capable of representing the
>      values of all the members of the enumeration. The enumerated type is
>      incomplete until after the } that terminates the list of enumerator
>      declarations.

There is an earlier 6.7.2.2 # 2 (just saw Florian post the same)

        The expression that defines the value of an enumeration constant
        shall be an integer constant expression that has a value representable
        as an int.

Which limits the value representable.
  
> And subscript 110:
> 
>      An implementation may delay the choice of which integer type until
>      all enumeration constants have been seen.

While this is true, the current value in the sign bit can't be represented
and so is invalid ISO C before C23 unless you have an extension.

> This definition taken from Gnulib is a good demonstration of this:
> 
>      typedef enum
>      {
>        _Bool_must_promote_to_int = -1,
>        false = 0,
>        true = 1
>      } _Bool;
> 
> Without _Bool_must_promote_to_int the compiler is free to chose
> 'unsigned int' as the type for the enum, which is unwanted in this case.

In summary:

- The current value of various header constants of the form 1U << 31 are
   technically not representable without a compiler language extension
   because before C23 this wasn't representable.

- I still think using "1U << 31" is the clearest possible solution and
   follows established precedent. You need a compiler that supports
   unsigned enums and the full representable range to compile code in
   the ecosystem.
  
Collin Funk March 25, 2025, 8:42 p.m. UTC | #11
Carlos O'Donell <carlos@redhat.com> writes:

> - I still think using "1U << 31" is the clearest possible solution and
>   follows established precedent. You need a compiler that supports
>   unsigned enums and the full representable range to compile code in
>   the ecosystem.

I agree, especially since you found the usage in other places like
<sys/epoll.h> and no one has had a problem with it, as far as I can
tell.

Thanks all,
Collin
  
Florian Weimer March 25, 2025, 9:32 p.m. UTC | #12
* Carlos O'Donell:

> - I still think using "1U << 31" is the clearest possible solution and
>   follows established precedent. You need a compiler that supports
>   unsigned enums and the full representable range to compile code in
>   the ecosystem.

I tend to agree for <sys/mount.h>.  In things like <math.h>, we should
aim for stricter conformance, but here it is okay.

Thanks,
Florian
  
Carlos O'Donell March 26, 2025, 3:58 a.m. UTC | #13
On 3/25/25 5:32 PM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> - I still think using "1U << 31" is the clearest possible solution and
>>    follows established precedent. You need a compiler that supports
>>    unsigned enums and the full representable range to compile code in
>>    the ecosystem.
> 
> I tend to agree for <sys/mount.h>.  In things like <math.h>, we should
> aim for stricter conformance, but here it is okay.

Absolutely, for <math.h> I think we need to be much more cautious.

Given that I see consensus from you and Collin I've pushed the changed.
  

Patch

diff --git a/sysdeps/unix/sysv/linux/sys/mount.h b/sysdeps/unix/sysv/linux/sys/mount.h
index 7c6d0805d7..b549e75148 100644
--- a/sysdeps/unix/sysv/linux/sys/mount.h
+++ b/sysdeps/unix/sysv/linux/sys/mount.h
@@ -121,7 +121,7 @@  enum
   MS_ACTIVE = 1 << 30,
 #define MS_ACTIVE	MS_ACTIVE
 #undef MS_NOUSER
-  MS_NOUSER = 1 << 31
+  MS_NOUSER = 1U << 31
 #define MS_NOUSER	MS_NOUSER
 };