uapi: Make __{u,s}64 match {u,}int64_t in userspace

Message ID YZvIlz7J6vOEY+Xu@yuki
State Not applicable
Headers
Series uapi: Make __{u,s}64 match {u,}int64_t in userspace |

Checks

Context Check Description
dj/TryBot-apply_patch fail Patch failed to apply to master at the time it was sent
dj/TryBot-32bit fail Patch series failed to apply

Commit Message

Cyril Hrubis Nov. 22, 2021, 4:43 p.m. UTC
  This changes the __u64 and __s64 in userspace on 64bit platforms from
long long (unsigned) int to just long (unsigned) int in order to match
the uint64_t and int64_t size in userspace.

This allows us to use the kernel structure definitions in userspace. For
example we can use PRIu64 and PRId64 modifiers in printf() to print
structure members. Morever with this there would be no need to redefine
these structures in libc implementations as it is done now.

Consider for example the newly added statx() syscall. If we use the
header from uapi we will get warnings when attempting to print it's
members as:

	printf("%" PRIu64 "\n", stx.stx_size);

We get:

	warning: format '%lu' expects argument of type 'long unsigned int',
	         but argument 5 has type '__u64' {aka 'long long unsigned int'}

After this patch the types match and no warnings are generated.

Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
---
 include/uapi/asm-generic/types.h | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
  

Comments

Alejandro Colomar \(mailing lists\) Nov. 22, 2021, 4:51 p.m. UTC | #1
Hi Cyril,

On 11/22/21 17:43, Cyril Hrubis wrote:
> This changes the __u64 and __s64 in userspace on 64bit platforms from
> long long (unsigned) int to just long (unsigned) int in order to match
> the uint64_t and int64_t size in userspace.
> 
> This allows us to use the kernel structure definitions in userspace. For
> example we can use PRIu64 and PRId64 modifiers in printf() to print
> structure members. Morever with this there would be no need to redefine
> these structures in libc implementations as it is done now.
> 
> Consider for example the newly added statx() syscall. If we use the
> header from uapi we will get warnings when attempting to print it's
> members as:
> 
> 	printf("%" PRIu64 "\n", stx.stx_size);
> 
> We get:
> 
> 	warning: format '%lu' expects argument of type 'long unsigned int',
> 	         but argument 5 has type '__u64' {aka 'long long unsigned int'}
> 
> After this patch the types match and no warnings are generated.
This would make it even easier to ignore the existence of different 
kernel types, and let userspace use standard types.

Related recent discussion:
<https://lore.kernel.org/linux-man/20210423230609.13519-1-alx.manpages@gmail.com/>

> 
> Signed-off-by: Cyril Hrubis <chrubis@suse.cz>

Acked-by: Alejandro Colomar <alx.manpages@gmail.com>

Thanks,
Alex

> ---
>   include/uapi/asm-generic/types.h | 11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/asm-generic/types.h b/include/uapi/asm-generic/types.h
> index dfaa50d99d8f..ae748a3678a4 100644
> --- a/include/uapi/asm-generic/types.h
> +++ b/include/uapi/asm-generic/types.h
> @@ -1,9 +1,16 @@
>   /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>   #ifndef _ASM_GENERIC_TYPES_H
>   #define _ASM_GENERIC_TYPES_H
> +
> +#include <asm/bitsperlong.h>
> +
>   /*
> - * int-ll64 is used everywhere now.
> + * int-ll64 is used everywhere in kernel now.
>    */
> -#include <asm-generic/int-ll64.h>
> +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)

BTW, C2X adds LONG_WIDTH in <limits.h> (and in general TYPE_WIDTH) to 
get the bits of a long.

> +# include <asm-generic/int-l64.h>
> +#else
> +# include <asm-generic/int-ll64.h>
> +#endif
>   
>   #endif /* _ASM_GENERIC_TYPES_H */
>
  
Arnd Bergmann Nov. 22, 2021, 8:48 p.m. UTC | #2
On Mon, Nov 22, 2021 at 5:43 PM Cyril Hrubis <chrubis@suse.cz> wrote:
>
> +
> +#include <asm/bitsperlong.h>
> +
>  /*
> - * int-ll64 is used everywhere now.
> + * int-ll64 is used everywhere in kernel now.
>   */
> -#include <asm-generic/int-ll64.h>
> +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
> +# include <asm-generic/int-l64.h>
> +#else
> +# include <asm-generic/int-ll64.h>
> +#endif

I don't think this is correct on all 64-bit architectures, as far as I
remember the
definition can use either 'long' or 'long long' depending on the user space
toolchain.

Out of the ten supported 64-bit architectures, there are four that already
use asm-generic/int-l64.h conditionally, and six that don't, and I
think at least
some of those are intentional.

I think it would be safer to do this one architecture at a time to make
sure this doesn't regress on those that require the int-ll64.h version.

There should also be a check for __SANE_USERSPACE_TYPES__
to let userspace ask for the ll64 version everywhere.

          Arnd
  
Zack Weinberg Nov. 22, 2021, 10:19 p.m. UTC | #3
On Mon, Nov 22, 2021, at 4:43 PM, Cyril Hrubis wrote:
> This changes the __u64 and __s64 in userspace on 64bit platforms from
> long long (unsigned) int to just long (unsigned) int in order to match
> the uint64_t and int64_t size in userspace.
...
> +
> +#include <asm/bitsperlong.h>
> +
>  /*
> - * int-ll64 is used everywhere now.
> + * int-ll64 is used everywhere in kernel now.
>   */
> -#include <asm-generic/int-ll64.h>
> +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
> +# include <asm-generic/int-l64.h>
> +#else
> +# include <asm-generic/int-ll64.h>
> +#endif

I am all for matching __uN / __sN to uintN_t / intN_t in userspace, but may I suggest the technically simpler and guaranteed-to-be-accurate

 /*
- * int-ll64 is used everywhere now.
+ * int-ll64 is used everywhere in kernel now.
+ * In user space match <stdint.h>.
  */
+#ifdef __KERNEL__
 # include <asm-generic/int-ll64.h>
+#elif __has_include (<bits/types.h>)
+# include <bits/types.h>
+typedef __int8_t __s8;
+typedef __uint8_t __u8;
+typedef __int16_t __s16;
+typedef __uint16_t __u16;
+typedef __int32_t __s32;
+typedef __uint32_t __u32;
+typedef __int64_t __s64;
+typedef __uint64_t __u64;
+#else
+# include <stdint.h>
+typedef int8_t __s8;
+typedef uint8_t __u8;
+typedef int16_t __s16;
+typedef uint16_t __u16;
+typedef int32_t __s32;
+typedef uint32_t __u32;
+typedef int64_t __s64;
+typedef uint64_t __u64;
+#endif

The middle clause could be dropped if we are okay with all uapi headers potentially exposing the non-implementation-namespace names defined by <stdint.h>.  I do not know what the musl libc equivalent of <bits/types.h> is.

zw
  
Cyril Hrubis Nov. 23, 2021, 9:14 a.m. UTC | #4
Hi!
> > +#include <asm/bitsperlong.h>
> > +
> >  /*
> > - * int-ll64 is used everywhere now.
> > + * int-ll64 is used everywhere in kernel now.
> >   */
> > -#include <asm-generic/int-ll64.h>
> > +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
> > +# include <asm-generic/int-l64.h>
> > +#else
> > +# include <asm-generic/int-ll64.h>
> > +#endif
> 
> I don't think this is correct on all 64-bit architectures, as far as I
> remember the
> definition can use either 'long' or 'long long' depending on the user space
> toolchain.

As far as I can tell the userspace bits/types.h does exactly the same
check in order to define uint64_t and int64_t, i.e.:

#if __WORDSIZE == 64
typedef signed long int __int64_t;
typedef unsigned long int __uint64_t;
#else
__extension__ typedef signed long long int __int64_t;
__extension__ typedef unsigned long long int __uint64_t;
#endif

The macro __WORDSIZE is defined per architecture, and it looks like the
defintions in glibc sources in bits/wordsize.h match the uapi
asm/bitsperlong.h. But I may have missed something, the code in glibc is
not exactly easy to read.

> Out of the ten supported 64-bit architectures, there are four that already
> use asm-generic/int-l64.h conditionally, and six that don't, and I
> think at least
> some of those are intentional.
>
> I think it would be safer to do this one architecture at a time to make
> sure this doesn't regress on those that require the int-ll64.h version.

I'm still trying to understand what exactly can go wrong here. As long
as __BITS_PER_LONG is correctly defined the __u64 and __s64 will be
correctly sized as well. The only visible change is that one 'long' is
dropped from the type when it's not needed.

> There should also be a check for __SANE_USERSPACE_TYPES__
> to let userspace ask for the ll64 version everywhere.

That one is easy to fix at least.
  
Cyril Hrubis Nov. 23, 2021, 9:15 a.m. UTC | #5
Hi!
> I am all for matching __uN / __sN to uintN_t / intN_t in userspace, but may I suggest the technically simpler and guaranteed-to-be-accurate
> 
>  /*
> - * int-ll64 is used everywhere now.
> + * int-ll64 is used everywhere in kernel now.
> + * In user space match <stdint.h>.
>   */
> +#ifdef __KERNEL__
>  # include <asm-generic/int-ll64.h>
> +#elif __has_include (<bits/types.h>)
> +# include <bits/types.h>
> +typedef __int8_t __s8;
> +typedef __uint8_t __u8;
> +typedef __int16_t __s16;
> +typedef __uint16_t __u16;
> +typedef __int32_t __s32;
> +typedef __uint32_t __u32;
> +typedef __int64_t __s64;
> +typedef __uint64_t __u64;
> +#else
> +# include <stdint.h>
> +typedef int8_t __s8;
> +typedef uint8_t __u8;
> +typedef int16_t __s16;
> +typedef uint16_t __u16;
> +typedef int32_t __s32;
> +typedef uint32_t __u32;
> +typedef int64_t __s64;
> +typedef uint64_t __u64;
> +#endif
> 
> The middle clause could be dropped if we are okay with all uapi headers potentially exposing the non-implementation-namespace names defined by <stdint.h>.  I do not know what the musl libc equivalent of <bits/types.h> is.

If it's okay to depend on a header defined by a libc this is better
solution.
  
Arnd Bergmann Nov. 23, 2021, 2:18 p.m. UTC | #6
On Tue, Nov 23, 2021 at 10:14 AM Cyril Hrubis <chrubis@suse.cz> wrote:
> > I don't think this is correct on all 64-bit architectures, as far as I
> > remember the
> > definition can use either 'long' or 'long long' depending on the user space
> > toolchain.
>
> As far as I can tell the userspace bits/types.h does exactly the same
> check in order to define uint64_t and int64_t, i.e.:
>
> #if __WORDSIZE == 64
> typedef signed long int __int64_t;
> typedef unsigned long int __uint64_t;
> #else
> __extension__ typedef signed long long int __int64_t;
> __extension__ typedef unsigned long long int __uint64_t;
> #endif
>
> The macro __WORDSIZE is defined per architecture, and it looks like the
> defintions in glibc sources in bits/wordsize.h match the uapi
> asm/bitsperlong.h. But I may have missed something, the code in glibc is
> not exactly easy to read.

It's possible that the only difference between the two files was the
'__u32'/'__s32' definition, which could be either 'int' or 'long'. We used
to try matching the user space types for these, but not use 'int'
everywhere in the kernel.

> > Out of the ten supported 64-bit architectures, there are four that already
> > use asm-generic/int-l64.h conditionally, and six that don't, and I
> > think at least
> > some of those are intentional.
> >
> > I think it would be safer to do this one architecture at a time to make
> > sure this doesn't regress on those that require the int-ll64.h version.
>
> I'm still trying to understand what exactly can go wrong here. As long
> as __BITS_PER_LONG is correctly defined the __u64 and __s64 will be
> correctly sized as well. The only visible change is that one 'long' is
> dropped from the type when it's not needed.

Correct, I'm not worried about getting incorrectly-sized types here,
but using the wrong type can cause compile-time warnings when
they are mismatched against format strings or assigning pointers
to the wrong types. With the kernel types, one would always use
%d for __u32 and %lld for __u64, while with the user space types,
one has to resort to using macros.

       Arnd
  
David Howells Nov. 23, 2021, 4:47 p.m. UTC | #7
Cyril Hrubis <chrubis@suse.cz> wrote:

> This changes the __u64 and __s64 in userspace on 64bit platforms from
> long long (unsigned) int to just long (unsigned) int in order to match
> the uint64_t and int64_t size in userspace.

Can you guarantee this won't break anything in userspace?  Granted the types
*ought* to be the same size, but anyone who's written code on the basis that
these are "(unsigned) long long int" may suddenly get a bunch of warnings
where they didn't before from the C compiler.  Anyone using C++, say, may find
their code no longer compiles because overloaded function matching no longer
finds a correct match.

Also, whilst your point about PRIu64 and PRId64 modifiers in printf() is a
good one, it doesn't help someone whose compiler doesn't support that (I don't
know if anyone's likely to encounter such these days).  At the moment, I think
a user can assume that %llu will work correctly both on 32-bit and 64-bit on
all arches, but you're definitely breaking that assumption.

David
  
David Laight Nov. 23, 2021, 4:58 p.m. UTC | #8
From: David Howells
> Sent: 23 November 2021 16:48
> 
> Cyril Hrubis <chrubis@suse.cz> wrote:
> 
> > This changes the __u64 and __s64 in userspace on 64bit platforms from
> > long long (unsigned) int to just long (unsigned) int in order to match
> > the uint64_t and int64_t size in userspace.

That is a massive UAPI change you can't do.

> Can you guarantee this won't break anything in userspace?  Granted the types
> *ought* to be the same size, but anyone who's written code on the basis that
> these are "(unsigned) long long int" may suddenly get a bunch of warnings
> where they didn't before from the C compiler.  Anyone using C++, say, may find
> their code no longer compiles because overloaded function matching no longer
> finds a correct match.

Indeed

> Also, whilst your point about PRIu64 and PRId64 modifiers in printf() is a
> good one, it doesn't help someone whose compiler doesn't support that (I don't
> know if anyone's likely to encounter such these days).  At the moment, I think
> a user can assume that %llu will work correctly both on 32-bit and 64-bit on
> all arches, but you're definitely breaking that assumption.

PRIu64 (etc) don't require compiler support, just the correct header file.

I'm pretty sure that portable user code needs to allow for int64_t being
either 'long' or 'long long' on 64bit architectures.
(Indeed 'long' may not even be 64bit.)

On 32bit int32_t can definitely be either 'int' of 'long' at whim.

I'm not sure anyone has tried a 64bit long with a 128bit long-long.
But the C language might lead you to do that.

Of course, fully portable code has to allow for char, short, int and long
all being the same size!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
  
Florian Weimer Nov. 23, 2021, 7:50 p.m. UTC | #9
* Cyril Hrubis:

> As far as I can tell the userspace bits/types.h does exactly the same
> check in order to define uint64_t and int64_t, i.e.:
>
> #if __WORDSIZE == 64
> typedef signed long int __int64_t;
> typedef unsigned long int __uint64_t;
> #else
> __extension__ typedef signed long long int __int64_t;
> __extension__ typedef unsigned long long int __uint64_t;
> #endif
>
> The macro __WORDSIZE is defined per architecture, and it looks like the
> defintions in glibc sources in bits/wordsize.h match the uapi
> asm/bitsperlong.h. But I may have missed something, the code in glibc is
> not exactly easy to read.

__WORDSIZE isn't exactly a standard libc macro.

On musl, x86-64 x32 has __WORDSIZE == 64 depending on header-inclusion
order, but that's probably just a bug.

Thanks,
Florian
  
Alejandro Colomar Nov. 24, 2021, 10:17 a.m. UTC | #10
On 11/23/21 20:50, Florian Weimer via Libc-alpha wrote:
> * Cyril Hrubis:
> 
>> As far as I can tell the userspace bits/types.h does exactly the same
>> check in order to define uint64_t and int64_t, i.e.:
>>
>> #if __WORDSIZE == 64
>> typedef signed long int __int64_t;
>> typedef unsigned long int __uint64_t;
>> #else
>> __extension__ typedef signed long long int __int64_t;
>> __extension__ typedef unsigned long long int __uint64_t;
>> #endif
>>
>> The macro __WORDSIZE is defined per architecture, and it looks like the
>> defintions in glibc sources in bits/wordsize.h match the uapi
>> asm/bitsperlong.h. But I may have missed something, the code in glibc is
>> not exactly easy to read.
> 
> __WORDSIZE isn't exactly a standard libc macro.

The (to-be) standard libc macro would be LONG_WIDTH (although it has a 
slightly different meaning, but it can be used for this, but then the 
code also needs to expose <limits.h>), rigth?

Regards,
Alex
  
Cyril Hrubis Nov. 29, 2021, 11:58 a.m. UTC | #11
Hi!
> > > This changes the __u64 and __s64 in userspace on 64bit platforms from
> > > long long (unsigned) int to just long (unsigned) int in order to match
> > > the uint64_t and int64_t size in userspace.
> 
> That is a massive UAPI change you can't do.

Understood.

However it can still be changed if user asks for it explicitly right?

What about guarding the change with __STDINT_COMPATIBLE_TYPES__ as:

#if defined(__STDINT_COMPATIBLE_TYPES__)
# include <stdint.h>

typedef __u64 uint64_t;
...

#else
# include <asm-generic/int-ll64.h>
#endif
  
Arnd Bergmann Nov. 29, 2021, 2:34 p.m. UTC | #12
On Mon, Nov 29, 2021 at 12:58 PM Cyril Hrubis <chrubis@suse.cz> wrote:
>
> What about guarding the change with __STDINT_COMPATIBLE_TYPES__ as:
>
> #if defined(__STDINT_COMPATIBLE_TYPES__)
> # include <stdint.h>
>
> typedef __u64 uint64_t;
> ...

I don't think we can include stdint.h here, the entire point of the custom
kernel types is to ensure the other kernel headers can use these types
without relying on libc headers.

While some of driver specific kernel headers have libc dependencies
in them, the general rule is to keep the kernel headers as standalone
usable.

       Arnd
  
Zack Weinberg Dec. 2, 2021, 2:55 p.m. UTC | #13
On Mon, Nov 29, 2021, at 9:34 AM, Arnd Bergmann wrote:
> On Mon, Nov 29, 2021 at 12:58 PM Cyril Hrubis <chrubis@suse.cz> wrote:
>>
>> What about guarding the change with __STDINT_COMPATIBLE_TYPES__

In user space, I don't see a compelling need for backward compatibility?  User space's expectation is that the types are *already* the same and we (glibc) regularly get bug reports because they aren't.

I could be persuaded otherwise with an example of a program for which changing
__s64 from 'long long' to 'long' would break *binary* backward compatibility, or
similarly for __u64.

> I don't think we can include stdint.h here, the entire point of the custom
> kernel types is to ensure the other kernel headers can use these types
> without relying on libc headers.

If __KERNEL__ is not defined, though, there should be no issue, right?

From user space's perspective, it's an ongoing source of problems whenever __uN isn't exactly the same "underlying type" as uintN_t, same for __sN and intN_t.  We would really like it if the uapi headers, when included from user space, deferred to the C library for the definitions of these types.

<stdint.h> does define a lot of things beyond just the fixed-width types, and it defines names in the application namespace (i.e. with no __ prefix).  Perhaps we could come to some agreement on a private header, provided by libcs, that *only* defined __{u,}int{8,16,32,64}_t.  glibc already has <bits/types.h> which promises
to define only __-prefix names, but it defines a lot of other types as well (__dev_t, __uid_t, __pid_t, __time_t, etc etc etc).

zw
  
David Howells Dec. 2, 2021, 3:01 p.m. UTC | #14
Zack Weinberg <zack@owlfolio.org> wrote:

> I could be persuaded otherwise with an example of a program for which
> changing __s64 from 'long long' to 'long' would break *binary* backward
> compatibility, or similarly for __u64.

C++ could break.

David
  
Rich Felker Dec. 2, 2021, 3:34 p.m. UTC | #15
On Mon, Nov 22, 2021 at 10:19:59PM +0000, Zack Weinberg via Libc-alpha wrote:
> On Mon, Nov 22, 2021, at 4:43 PM, Cyril Hrubis wrote:
> > This changes the __u64 and __s64 in userspace on 64bit platforms from
> > long long (unsigned) int to just long (unsigned) int in order to match
> > the uint64_t and int64_t size in userspace.
> ....
> > +
> > +#include <asm/bitsperlong.h>
> > +
> >  /*
> > - * int-ll64 is used everywhere now.
> > + * int-ll64 is used everywhere in kernel now.
> >   */
> > -#include <asm-generic/int-ll64.h>
> > +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
> > +# include <asm-generic/int-l64.h>
> > +#else
> > +# include <asm-generic/int-ll64.h>
> > +#endif
> 
> I am all for matching __uN / __sN to uintN_t / intN_t in userspace, but may I suggest the technically simpler and guaranteed-to-be-accurate
> 
>  /*
> - * int-ll64 is used everywhere now.
> + * int-ll64 is used everywhere in kernel now.
> + * In user space match <stdint.h>.
>   */
> +#ifdef __KERNEL__
>  # include <asm-generic/int-ll64.h>
> +#elif __has_include (<bits/types.h>)
> +# include <bits/types.h>
> +typedef __int8_t __s8;
> +typedef __uint8_t __u8;
> +typedef __int16_t __s16;
> +typedef __uint16_t __u16;
> +typedef __int32_t __s32;
> +typedef __uint32_t __u32;
> +typedef __int64_t __s64;
> +typedef __uint64_t __u64;
> +#else
> +# include <stdint.h>
> +typedef int8_t __s8;
> +typedef uint8_t __u8;
> +typedef int16_t __s16;
> +typedef uint16_t __u16;
> +typedef int32_t __s32;
> +typedef uint32_t __u32;
> +typedef int64_t __s64;
> +typedef uint64_t __u64;
> +#endif
> 
> The middle clause could be dropped if we are okay with all uapi
> headers potentially exposing the non-implementation-namespace names
> defined by <stdint.h>. I do not know what the musl libc equivalent
> of <bits/types.h> is.

We (musl) don't have an equivalent header or __-prefixed versions of
these types.

FWIW I don't think stdint.h exposes anything that would be problematic
alongside arbitrary use of kernel headers.

Rich
  
Zack Weinberg Dec. 2, 2021, 8:48 p.m. UTC | #16
On Thu, Dec 2, 2021, at 10:01 AM, David Howells via Libc-alpha wrote:
> Zack Weinberg <zack@owlfolio.org> wrote:
>> I could be persuaded otherwise with an example of a program for which
>> changing __s64 from 'long long' to 'long' would break *binary* backward
>> compatibility, or similarly for __u64.
>
> C++ could break.

That's too hypothetical to be actionable.  I would like to see a _specific program_, and I would like it to be one that already exists in the world and was not written as a test case for this hypothetical ABI break.

zw
  
Rich Felker Dec. 2, 2021, 11:29 p.m. UTC | #17
On Thu, Dec 02, 2021 at 10:34:23AM -0500, Rich Felker wrote:
> On Mon, Nov 22, 2021 at 10:19:59PM +0000, Zack Weinberg via Libc-alpha wrote:
> > On Mon, Nov 22, 2021, at 4:43 PM, Cyril Hrubis wrote:
> > > This changes the __u64 and __s64 in userspace on 64bit platforms from
> > > long long (unsigned) int to just long (unsigned) int in order to match
> > > the uint64_t and int64_t size in userspace.
> > ....
> > > +
> > > +#include <asm/bitsperlong.h>
> > > +
> > >  /*
> > > - * int-ll64 is used everywhere now.
> > > + * int-ll64 is used everywhere in kernel now.
> > >   */
> > > -#include <asm-generic/int-ll64.h>
> > > +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
> > > +# include <asm-generic/int-l64.h>
> > > +#else
> > > +# include <asm-generic/int-ll64.h>
> > > +#endif
> > 
> > I am all for matching __uN / __sN to uintN_t / intN_t in userspace, but may I suggest the technically simpler and guaranteed-to-be-accurate
> > 
> >  /*
> > - * int-ll64 is used everywhere now.
> > + * int-ll64 is used everywhere in kernel now.
> > + * In user space match <stdint.h>.
> >   */
> > +#ifdef __KERNEL__
> >  # include <asm-generic/int-ll64.h>
> > +#elif __has_include (<bits/types.h>)
> > +# include <bits/types.h>
> > +typedef __int8_t __s8;
> > +typedef __uint8_t __u8;
> > +typedef __int16_t __s16;
> > +typedef __uint16_t __u16;
> > +typedef __int32_t __s32;
> > +typedef __uint32_t __u32;
> > +typedef __int64_t __s64;
> > +typedef __uint64_t __u64;
> > +#else
> > +# include <stdint.h>
> > +typedef int8_t __s8;
> > +typedef uint8_t __u8;
> > +typedef int16_t __s16;
> > +typedef uint16_t __u16;
> > +typedef int32_t __s32;
> > +typedef uint32_t __u32;
> > +typedef int64_t __s64;
> > +typedef uint64_t __u64;
> > +#endif
> > 
> > The middle clause could be dropped if we are okay with all uapi
> > headers potentially exposing the non-implementation-namespace names
> > defined by <stdint.h>. I do not know what the musl libc equivalent
> > of <bits/types.h> is.
> 
> We (musl) don't have an equivalent header or __-prefixed versions of
> these types.
> 
> FWIW I don't think stdint.h exposes anything that would be problematic
> alongside arbitrary use of kernel headers.

Also, per glibc's bits/types.h:

/*
 * Never include this file directly; use <sys/types.h> instead.
 */

it's not permitted (not supported usage) to #include <bits/types.h>.
So I think the above patch is wrong for glibc too. As I understand it,
this is general policy for bits/* -- they're only intended to work as
included by the libc system headers, not directly by something else.

Rich
  
Adhemerval Zanella Dec. 2, 2021, 11:43 p.m. UTC | #18
On 02/12/2021 20:29, Rich Felker wrote:
> On Thu, Dec 02, 2021 at 10:34:23AM -0500, Rich Felker wrote:
>> On Mon, Nov 22, 2021 at 10:19:59PM +0000, Zack Weinberg via Libc-alpha wrote:
>>> On Mon, Nov 22, 2021, at 4:43 PM, Cyril Hrubis wrote:
>>>> This changes the __u64 and __s64 in userspace on 64bit platforms from
>>>> long long (unsigned) int to just long (unsigned) int in order to match
>>>> the uint64_t and int64_t size in userspace.
>>> ....
>>>> +
>>>> +#include <asm/bitsperlong.h>
>>>> +
>>>>  /*
>>>> - * int-ll64 is used everywhere now.
>>>> + * int-ll64 is used everywhere in kernel now.
>>>>   */
>>>> -#include <asm-generic/int-ll64.h>
>>>> +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
>>>> +# include <asm-generic/int-l64.h>
>>>> +#else
>>>> +# include <asm-generic/int-ll64.h>
>>>> +#endif
>>>
>>> I am all for matching __uN / __sN to uintN_t / intN_t in userspace, but may I suggest the technically simpler and guaranteed-to-be-accurate
>>>
>>>  /*
>>> - * int-ll64 is used everywhere now.
>>> + * int-ll64 is used everywhere in kernel now.
>>> + * In user space match <stdint.h>.
>>>   */
>>> +#ifdef __KERNEL__
>>>  # include <asm-generic/int-ll64.h>
>>> +#elif __has_include (<bits/types.h>)
>>> +# include <bits/types.h>
>>> +typedef __int8_t __s8;
>>> +typedef __uint8_t __u8;
>>> +typedef __int16_t __s16;
>>> +typedef __uint16_t __u16;
>>> +typedef __int32_t __s32;
>>> +typedef __uint32_t __u32;
>>> +typedef __int64_t __s64;
>>> +typedef __uint64_t __u64;
>>> +#else
>>> +# include <stdint.h>
>>> +typedef int8_t __s8;
>>> +typedef uint8_t __u8;
>>> +typedef int16_t __s16;
>>> +typedef uint16_t __u16;
>>> +typedef int32_t __s32;
>>> +typedef uint32_t __u32;
>>> +typedef int64_t __s64;
>>> +typedef uint64_t __u64;
>>> +#endif
>>>
>>> The middle clause could be dropped if we are okay with all uapi
>>> headers potentially exposing the non-implementation-namespace names
>>> defined by <stdint.h>. I do not know what the musl libc equivalent
>>> of <bits/types.h> is.
>>
>> We (musl) don't have an equivalent header or __-prefixed versions of
>> these types.
>>
>> FWIW I don't think stdint.h exposes anything that would be problematic
>> alongside arbitrary use of kernel headers.
> 
> Also, per glibc's bits/types.h:
> 
> /*
>  * Never include this file directly; use <sys/types.h> instead.
>  */
> 
> it's not permitted (not supported usage) to #include <bits/types.h>.
> So I think the above patch is wrong for glibc too. As I understand it,
> this is general policy for bits/* -- they're only intended to work as
> included by the libc system headers, not directly by something else.

You are right, the idea is to allow glibc to create and remove internal headers.
  
Zack Weinberg Dec. 3, 2021, 12:10 a.m. UTC | #19
On Thu, Dec 2, 2021, at 6:43 PM, Adhemerval Zanella via Libc-alpha wrote:
> On 02/12/2021 20:29, Rich Felker wrote:
>> On Thu, Dec 02, 2021 at 10:34:23AM -0500, Rich Felker wrote:
>>> On Mon, Nov 22, 2021 at 10:19:59PM +0000, Zack Weinberg via Libc-alpha wrote:
>>>> On Mon, Nov 22, 2021, at 4:43 PM, Cyril Hrubis wrote:
>>>>> This changes the __u64 and __s64 in userspace on 64bit platforms from
>>>>> long long (unsigned) int to just long (unsigned) int in order to match
>>>>> the uint64_t and int64_t size in userspace.
>>>> ....
>>>>> +
>>>>> +#include <asm/bitsperlong.h>
>>>>> +
>>>>>  /*
>>>>> - * int-ll64 is used everywhere now.
>>>>> + * int-ll64 is used everywhere in kernel now.
>>>>>   */
>>>>> -#include <asm-generic/int-ll64.h>
>>>>> +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
>>>>> +# include <asm-generic/int-l64.h>
>>>>> +#else
>>>>> +# include <asm-generic/int-ll64.h>
>>>>> +#endif
>>>>
>>>> I am all for matching __uN / __sN to uintN_t / intN_t in userspace, but may I suggest the technically simpler and guaranteed-to-be-accurate
>>>>
>>>>  /*
>>>> - * int-ll64 is used everywhere now.
>>>> + * int-ll64 is used everywhere in kernel now.
>>>> + * In user space match <stdint.h>.
>>>>   */
>>>> +#ifdef __KERNEL__
>>>>  # include <asm-generic/int-ll64.h>
>>>> +#elif __has_include (<bits/types.h>)
>>>> +# include <bits/types.h>
>>>> +typedef __int8_t __s8;
>>>> +typedef __uint8_t __u8;
>>>> +typedef __int16_t __s16;
>>>> +typedef __uint16_t __u16;
>>>> +typedef __int32_t __s32;
>>>> +typedef __uint32_t __u32;
>>>> +typedef __int64_t __s64;
>>>> +typedef __uint64_t __u64;
>>>> +#else
>>>> +# include <stdint.h>
>>>> +typedef int8_t __s8;
>>>> +typedef uint8_t __u8;
>>>> +typedef int16_t __s16;
>>>> +typedef uint16_t __u16;
>>>> +typedef int32_t __s32;
>>>> +typedef uint32_t __u32;
>>>> +typedef int64_t __s64;
>>>> +typedef uint64_t __u64;
>>>> +#endif
>>>>
>>>> The middle clause could be dropped if we are okay with all uapi
>>>> headers potentially exposing the non-implementation-namespace names
>>>> defined by <stdint.h>. I do not know what the musl libc equivalent
>>>> of <bits/types.h> is.
>>>
>>> We (musl) don't have an equivalent header or __-prefixed versions of
>>> these types.
>>>
>>> FWIW I don't think stdint.h exposes anything that would be problematic
>>> alongside arbitrary use of kernel headers.
>> 
>> Also, per glibc's bits/types.h:
>> 
>> /*
>>  * Never include this file directly; use <sys/types.h> instead.
>>  */
>> 
>> it's not permitted (not supported usage) to #include <bits/types.h>.
>> So I think the above patch is wrong for glibc too. As I understand it,
>> this is general policy for bits/* -- they're only intended to work as
>> included by the libc system headers, not directly by something else.
>
> You are right, the idea is to allow glibc to create and remove internal headers.

As a general rule yes, but we could make a deal that some specific bits headers are permanent API for use by things like this. They probably should be less of a dumping ground than bits/types.h though.
  
Adhemerval Zanella Dec. 3, 2021, 12:32 p.m. UTC | #20
On 02/12/2021 21:10, Zack Weinberg wrote:
> On Thu, Dec 2, 2021, at 6:43 PM, Adhemerval Zanella via Libc-alpha wrote:
>> On 02/12/2021 20:29, Rich Felker wrote:
>>> On Thu, Dec 02, 2021 at 10:34:23AM -0500, Rich Felker wrote:
>>>> On Mon, Nov 22, 2021 at 10:19:59PM +0000, Zack Weinberg via Libc-alpha wrote:
>>>>> On Mon, Nov 22, 2021, at 4:43 PM, Cyril Hrubis wrote:
>>>>>> This changes the __u64 and __s64 in userspace on 64bit platforms from
>>>>>> long long (unsigned) int to just long (unsigned) int in order to match
>>>>>> the uint64_t and int64_t size in userspace.
>>>>> ....
>>>>>> +
>>>>>> +#include <asm/bitsperlong.h>
>>>>>> +
>>>>>>  /*
>>>>>> - * int-ll64 is used everywhere now.
>>>>>> + * int-ll64 is used everywhere in kernel now.
>>>>>>   */
>>>>>> -#include <asm-generic/int-ll64.h>
>>>>>> +#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
>>>>>> +# include <asm-generic/int-l64.h>
>>>>>> +#else
>>>>>> +# include <asm-generic/int-ll64.h>
>>>>>> +#endif
>>>>>
>>>>> I am all for matching __uN / __sN to uintN_t / intN_t in userspace, but may I suggest the technically simpler and guaranteed-to-be-accurate
>>>>>
>>>>>  /*
>>>>> - * int-ll64 is used everywhere now.
>>>>> + * int-ll64 is used everywhere in kernel now.
>>>>> + * In user space match <stdint.h>.
>>>>>   */
>>>>> +#ifdef __KERNEL__
>>>>>  # include <asm-generic/int-ll64.h>
>>>>> +#elif __has_include (<bits/types.h>)
>>>>> +# include <bits/types.h>
>>>>> +typedef __int8_t __s8;
>>>>> +typedef __uint8_t __u8;
>>>>> +typedef __int16_t __s16;
>>>>> +typedef __uint16_t __u16;
>>>>> +typedef __int32_t __s32;
>>>>> +typedef __uint32_t __u32;
>>>>> +typedef __int64_t __s64;
>>>>> +typedef __uint64_t __u64;
>>>>> +#else
>>>>> +# include <stdint.h>
>>>>> +typedef int8_t __s8;
>>>>> +typedef uint8_t __u8;
>>>>> +typedef int16_t __s16;
>>>>> +typedef uint16_t __u16;
>>>>> +typedef int32_t __s32;
>>>>> +typedef uint32_t __u32;
>>>>> +typedef int64_t __s64;
>>>>> +typedef uint64_t __u64;
>>>>> +#endif
>>>>>
>>>>> The middle clause could be dropped if we are okay with all uapi
>>>>> headers potentially exposing the non-implementation-namespace names
>>>>> defined by <stdint.h>. I do not know what the musl libc equivalent
>>>>> of <bits/types.h> is.
>>>>
>>>> We (musl) don't have an equivalent header or __-prefixed versions of
>>>> these types.
>>>>
>>>> FWIW I don't think stdint.h exposes anything that would be problematic
>>>> alongside arbitrary use of kernel headers.
>>>
>>> Also, per glibc's bits/types.h:
>>>
>>> /*
>>>  * Never include this file directly; use <sys/types.h> instead.
>>>  */
>>>
>>> it's not permitted (not supported usage) to #include <bits/types.h>.
>>> So I think the above patch is wrong for glibc too. As I understand it,
>>> this is general policy for bits/* -- they're only intended to work as
>>> included by the libc system headers, not directly by something else.
>>
>> You are right, the idea is to allow glibc to create and remove internal headers.
> 
> As a general rule yes, but we could make a deal that some specific bits headers are permanent API for use by things like this. They probably should be less of a dumping ground than bits/types.h though.

I really don't think adding such constraints really would improve the project
in long term, we already have issues about the need to support some internal
symbols that were exported by accident.
  
Alejandro Colomar Dec. 3, 2021, 12:54 p.m. UTC | #21
On 12/3/21 13:32, Adhemerval Zanella via Libc-alpha wrote:
>>>>> We (musl) don't have an equivalent header or __-prefixed versions of
>>>>> these types.
>>>>>
>>>>> FWIW I don't think stdint.h exposes anything that would be problematic
>>>>> alongside arbitrary use of kernel headers.

>>>>
>>>> Also, per glibc's bits/types.h:
>>>>
>>>> /*
>>>>   * Never include this file directly; use <sys/types.h> instead.
>>>>   */
>>>>
>>>> it's not permitted (not supported usage) to #include <bits/types.h>.
>>>> So I think the above patch is wrong for glibc too. As I understand it,
>>>> this is general policy for bits/* -- they're only intended to work as
>>>> included by the libc system headers, not directly by something else.
>>>
>>> You are right, the idea is to allow glibc to create and remove internal headers.
>>
>> As a general rule yes, but we could make a deal that some specific bits headers are permanent API for use by things like this. They probably should be less of a dumping ground than bits/types.h though.
> 
> I really don't think adding such constraints really would improve the project
> in long term, we already have issues about the need to support some internal
> symbols that were exported by accident.
> 

I think there are a few headers that should be safe to include from the 
kernel.

Could anyone foresee any problem that could arise by including any of 
the following headers in kernel code?:

<stdint.h>
<stddef.h>

They are the intended interface, and they only provide macros and types 
but not functions, and there should be no need to require libcs to 
define another identical stable interface.  If there's an existing 
program that would break by including any of those in uapi headers, I'm 
curious to see that program.

Of course, the kernel already defined some of the macros defined there, 
but the solution would be easy:  remove the redefinitions, since they 
should be defined to equivalent code (e.g., offsetof(), NULL; although 
these are from <stddef.h>, which for this change would be unnecessary, 
but if <stdint.h> can be used within the kernel, a next step could be to 
rely on libc <stddef.h>)

Thanks,
Alex
  
Cyril Hrubis Dec. 8, 2021, 3:33 p.m. UTC | #22
Hi!
> > I could be persuaded otherwise with an example of a program for which
> > changing __s64 from 'long long' to 'long' would break *binary* backward
> > compatibility, or similarly for __u64.
> 
> C++ could break.

Thinking of this again we can detect C++ as well so it can be safely
enabled just for C with:

#if !defined(__KERNEL__) && !defined(__cplusplus) && __BITSPERLONG == 64
# include <asm-generic/int-l64.h>
#else
# include <asm-generic/int-ll64.h>
#endif
  
Alejandro Colomar June 17, 2022, 12:13 p.m. UTC | #23
Hi Cyril,

On 12/8/21 16:33, Cyril Hrubis wrote:
> Hi!
>>> I could be persuaded otherwise with an example of a program for which
>>> changing __s64 from 'long long' to 'long' would break *binary* backward
>>> compatibility, or similarly for __u64.
>>
>> C++ could break.
> 
> Thinking of this again we can detect C++ as well so it can be safely
> enabled just for C with:
> 
> #if !defined(__KERNEL__) && !defined(__cplusplus) && __BITSPERLONG == 64
> # include <asm-generic/int-l64.h>
> #else
> # include <asm-generic/int-ll64.h>
> #endif
> 

I'm very interested in seeing this merged, as that would allow 
simplifying the man-pages by removing unnecessary kernel details such as 
u64[1].  How is the state of this patch?

Cheers,

Alex


[1]: 
<https://lore.kernel.org/linux-man/20210423230609.13519-1-alx.manpages@gmail.com/T/#u>
  
Cyril Hrubis June 17, 2022, 3:04 p.m. UTC | #24
Hi!
> >>> I could be persuaded otherwise with an example of a program for which
> >>> changing __s64 from 'long long' to 'long' would break *binary* backward
> >>> compatibility, or similarly for __u64.
> >>
> >> C++ could break.
> > 
> > Thinking of this again we can detect C++ as well so it can be safely
> > enabled just for C with:
> > 
> > #if !defined(__KERNEL__) && !defined(__cplusplus) && __BITSPERLONG == 64
> > # include <asm-generic/int-l64.h>
> > #else
> > # include <asm-generic/int-ll64.h>
> > #endif
> > 
> 
> I'm very interested in seeing this merged, as that would allow 
> simplifying the man-pages by removing unnecessary kernel details such as 
> u64[1].  How is the state of this patch?

I guess that it stalled because I haven't posted it as an actual patch,
I should do so to get this back on a track.
  

Patch

diff --git a/include/uapi/asm-generic/types.h b/include/uapi/asm-generic/types.h
index dfaa50d99d8f..ae748a3678a4 100644
--- a/include/uapi/asm-generic/types.h
+++ b/include/uapi/asm-generic/types.h
@@ -1,9 +1,16 @@ 
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 #ifndef _ASM_GENERIC_TYPES_H
 #define _ASM_GENERIC_TYPES_H
+
+#include <asm/bitsperlong.h>
+
 /*
- * int-ll64 is used everywhere now.
+ * int-ll64 is used everywhere in kernel now.
  */
-#include <asm-generic/int-ll64.h>
+#if __BITS_PER_LONG == 64 && !defined(__KERNEL__)
+# include <asm-generic/int-l64.h>
+#else
+# include <asm-generic/int-ll64.h>
+#endif
 
 #endif /* _ASM_GENERIC_TYPES_H */