[v2,2/7] y2038: Introduce __ASSUME_64BIT_TIME define

Message ID 20190429104613.16209-3-lukma@denx.de
State Superseded
Headers

Commit Message

Lukasz Majewski April 29, 2019, 10:46 a.m. UTC
  This define indicates if the Linux kernel (5.1+) provides 64 bit versions
of time related syscalls (e.g. clock_settime64, clock_nanosleep_time64).

Those syscalls are now available on actively supported Linux architectures
and most of all are providing Y2038 correct time on 32 bit systems.

* sysdeps/unix/sysv/linux/kernel-features.h: (__ASSUME_64BIT_TIME):
[__LINUX_KERNEL_VERSION >= 0x050100]: Define.

---
Changes for v2:
- New patch
---
 sysdeps/unix/sysv/linux/kernel-features.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
  

Comments

Joseph Myers April 29, 2019, 9:50 p.m. UTC | #1
On Mon, 29 Apr 2019, Lukasz Majewski wrote:

> +/* Support for 64 bit version of clock_* Linux syscalls.
> +
> +   Support for following time related (and Y2038 safe) syscalls has been added
> +   in the 5.1 Linux kernel:
> +
> +   clock_gettime64 (nr. 403)
> +   clock_settime64 (nr. 404)
> +   clock_getres_time64 (nr. 406)
> +   clock_nanosleep_time64 (nr. 407)
> +  */
> +#if __LINUX_KERNEL_VERSION >= 0x050100
> +# define __ASSUME_64BIT_TIME 1
> +#endif

This comment and macro definition are the key thing that need reviewing, 
probably over several iterations, before the rest of this patch series can 
be properly reviewed.  It is critical that the comment is completely clear 
and unambiguous about the exact macro semantics on the various relevant 
classes of architectures.

See what I wrote in 
<https://sourceware.org/ml/libc-alpha/2018-09/msg00448.html> and 
<https://sourceware.org/ml/libc-alpha/2018-12/msg00568.html>.  I don't 
think the comment meets those requirements at present - that is, if you 
try to deduce from it what the macro definition should be on all the 
listed classes of architectures, either the conclusion is not clear or it 
sometimes conflicts with the actual definition.

In particular, for existing 64-bit architectures, my understanding is that 
the kernel *does not* add new syscall names or numbers for the syscalls 
you list, and so it would be incorrect to define the macro in that case, 
but this patch defines it anyway.

Note 1: the comment should not reference the above URLs; it should be 
self-contained.  As stated in the second message above, it needs to be 
clear to people who haven't read any of the mailing list discussions 
around Y2038 issues.

Note 2: if the comment actually needs to define the classes 1a, 1b, 2a, 
2b, 3, it's probably using the wrong abstractions.  It should be very 
careful to refer to the abstraction that actually most reliably determines 
the presence or absence of the new syscalls (which might be the size of 
"long int" used in the syscall interface - glibc's __syscall_slong_t, 
which happens always to be the same size as __TIMESIZE for existing glibc 
ports but might not be for future ports - but make sure of that).  Once 
the relevant abstraction is very clear, the reader can deduce the answer 
for each class of glibc ports.

Note 3: it's wrong to state the syscall numbers in the comment; that is 
not a relevant part of understanding the interface.  Stating the names, 
however, makes sense, provided you make sure not to use the __ASSUME_ 
macro for any *other* syscalls without updating the comment, and, at that 
time, reviewing whether the same definition conditions still work for all 
those syscalls.  (Given that, as previously discussed, there might be 
*some* new syscalls even for architectures that already have 64-bit time, 
in order to provide timespec-based versions of syscalls currently using 
timeval.)
  
Lukasz Majewski April 30, 2019, 9:05 a.m. UTC | #2
Hi Joseph,

Thanks for your reply.

> On Mon, 29 Apr 2019, Lukasz Majewski wrote:
> 
> > +/* Support for 64 bit version of clock_* Linux syscalls.
> > +
> > +   Support for following time related (and Y2038 safe) syscalls
> > has been added
> > +   in the 5.1 Linux kernel:
> > +
> > +   clock_gettime64 (nr. 403)
> > +   clock_settime64 (nr. 404)
> > +   clock_getres_time64 (nr. 406)
> > +   clock_nanosleep_time64 (nr. 407)
> > +  */
> > +#if __LINUX_KERNEL_VERSION >= 0x050100
> > +# define __ASSUME_64BIT_TIME 1
> > +#endif  
> 
> This comment and macro definition are the key thing that need
> reviewing, probably over several iterations,

Ok. Please find my comments/concerns below regarding this __ASSUME
define.

> before the rest of this
> patch series can be properly reviewed.

I would like to point out one thing - the rest of this patch series
also has an important goal - reviewing them would set the direction
(despite the __ASSUME discussion) for future Y2038 development and
conversion of other parts of glibc.

For example:

 - The decisions about the shape of internal/external struct timespec
   in glibc.

 - The need for explicit clearing padding when calling syscalls (as to
   be better safe than sorry in the future - there was related
   discussion started by Stepan).

 - If the conversion itself (__clock_settime64 vs clock_settime) is
   correct/acceptable.

Would greatly facilitate the development process and reduce the number
of iterations.

>  It is critical that the
> comment is completely clear and unambiguous about the exact macro
> semantics on the various relevant classes of architectures.
> 
> See what I wrote in 
> <https://sourceware.org/ml/libc-alpha/2018-09/msg00448.html> and 
> <https://sourceware.org/ml/libc-alpha/2018-12/msg00568.html>.  I
> don't think the comment meets those requirements at present - that
> is, if you try to deduce from it what the macro definition should be
> on all the listed classes of architectures, either the conclusion is
> not clear or it sometimes conflicts with the actual definition.
> 
> In particular, for existing 64-bit architectures, my understanding is
> that the kernel *does not* add new syscall names or numbers for the
> syscalls you list,

With the 5.1-rc6 it seems like 64 bit architectures are not add those
syscalls (like e.g. clock_settime64).

> and so it would be incorrect to define the macro
> in that case, but this patch defines it anyway.

You are right here - the 

#if __TIMESIZE != 64
# if __LINUX_KERNEL_VERSION >= 0x050100
#  define __ASSUME_64BIT_TIME 1
# endif
#endif

would do the trick.

> 
> Note 1: the comment should not reference the above URLs; it should be 
> self-contained.  As stated in the second message above, it needs to
> be clear to people who haven't read any of the mailing list
> discussions around Y2038 issues.

Ok. I will prepare self contained comment.

> 
> Note 2: if the comment actually needs to define the classes 1a, 1b,
> 2a, 2b, 3, it's probably using the wrong abstractions.  It should be
> very careful to refer to the abstraction that actually most reliably
> determines the presence or absence of the new syscalls (which might
> be the size of "long int" used in the syscall interface - glibc's
> __syscall_slong_t, which happens always to be the same size as
> __TIMESIZE for existing glibc ports but might not be for future ports
> - but make sure of that). 

The ABI (syscalls) compatibility was one of the concerns raised in this
patch series as a whole.

The glibc's internal struct __timespec64 passed to e.g. clock_settime64
syscall has explicit __time64_t (tv_sec), __int32_t (tv_nsec) and 32 bit
padding.


> Once the relevant abstraction is very
> clear, the reader can deduce the answer for each class of glibc ports.

IMHO, the abstraction would be:

1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit native systems

2. It is defined by default in:
sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems (and the
actual presence of the syscall is decided upon definitions of __NR_xxx*
(i.e. # ifdef __NR_clock_settime64).

As those syscalls are provided on almost every 32 bit system now
(5.1-rc6):
git grep -n "clock_settime64"

gives support for: arm, arm64 (compat mode), m68k, microblaze, mips,
parisc, powerpc, s390, sh, sparc, x86, xtensa

So it would be reasonable to just add this __ASSUME definition code to
sysdeps/unix/sysv/linux/kernel-features.h and #undef it for
architectures not supporting it (i.e. c-sky and riscv).

> 
> Note 3: it's wrong to state the syscall numbers in the comment; that
> is not a relevant part of understanding the interface.  Stating the
> names, however, makes sense, provided you make sure not to use the
> __ASSUME_ macro for any *other* syscalls without updating the
> comment, and, at that time, reviewing whether the same definition
> conditions still work for all those syscalls. 

Correct, the names of supported syscalls also shall be written to the
comment (and the comment itself shall be extended with other, upcoming
patches).

> (Given that, as
> previously discussed, there might be *some* new syscalls even for
> architectures that already have 64-bit time, in order to provide
> timespec-based versions of syscalls currently using timeval.)
> 



Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Stepan Golosunov April 30, 2019, 9:47 p.m. UTC | #3
30.04.2019 в 11:05:05 +0200 Lukasz Majewski написал:
> #if __TIMESIZE != 64
> # if __LINUX_KERNEL_VERSION >= 0x050100
> #  define __ASSUME_64BIT_TIME 1
> # endif
> #endif

I think __WORDSIZE would be more appropriate here than __TIMESIZE.


> IMHO, the abstraction would be:
> 
> 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit native systems
> 
> 2. It is defined by default in:
> sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems (and the
> actual presence of the syscall is decided upon definitions of __NR_xxx*
> (i.e. # ifdef __NR_clock_settime64).

I think that __NR_clock_settime64 should be used unconditionally when
__ASSUME_64BIT_TIME is defined.

(__ASSUME_TIME64_SYSCALLS would probably be better name.)


> As those syscalls are provided on almost every 32 bit system now
> (5.1-rc6):
> git grep -n "clock_settime64"
> 
> gives support for: arm, arm64 (compat mode), m68k, microblaze, mips,
> parisc, powerpc, s390, sh, sparc, x86, xtensa
> 
> So it would be reasonable to just add this __ASSUME definition code to
> sysdeps/unix/sysv/linux/kernel-features.h and #undef it for
> architectures not supporting it (i.e. c-sky and riscv).

I believe that the only 32-bit architecture without
__NR_clock_settime64 is x32.  While newer 32-bit architectures like
riscv do not have __NR_clock_settime:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4c08b9776b392e20efc6198ebe1bc8ec1911d9b
  
Lukasz Majewski May 2, 2019, 8:51 a.m. UTC | #4
Hi Stepan,

Thank you for your reply.

> 30.04.2019 в 11:05:05 +0200 Lukasz Majewski написал:
> > #if __TIMESIZE != 64
> > # if __LINUX_KERNEL_VERSION >= 0x050100
> > #  define __ASSUME_64BIT_TIME 1
> > # endif
> > #endif  
> 
> I think __WORDSIZE would be more appropriate here than __TIMESIZE.
> 

Yes. I do agree.

> 
> > IMHO, the abstraction would be:
> > 
> > 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit native
> > systems
> > 
> > 2. It is defined by default in:
> > sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems (and
> > the actual presence of the syscall is decided upon definitions of
> > __NR_xxx* (i.e. # ifdef __NR_clock_settime64).  
> 
> I think that __NR_clock_settime64 should be used unconditionally when
> __ASSUME_64BIT_TIME is defined.

Could you clarify it a bit?

In the code as proposed in:
https://patchwork.ozlabs.org/patch/1092583/

The call to clock_settime64 is protected with # ifdef
__NR_clock_settime64 - otherwise we do a fallback to (32 bit)
clock_settime.

Moreover, the # ifdef __ASSUME_64BIT_TIME provides a fallback path if
kernel version is older than 5.1.

> 
> (__ASSUME_TIME64_SYSCALLS would probably be better name.)

I do tend to agree. The __ASSUME_TIME64_SYSCALLS is more descriptive
than __ASSUME_64BIT_TIME.

> 
> 
> > As those syscalls are provided on almost every 32 bit system now
> > (5.1-rc6):
> > git grep -n "clock_settime64"
> > 
> > gives support for: arm, arm64 (compat mode), m68k, microblaze, mips,
> > parisc, powerpc, s390, sh, sparc, x86, xtensa
> > 
> > So it would be reasonable to just add this __ASSUME definition code
> > to sysdeps/unix/sysv/linux/kernel-features.h and #undef it for
> > architectures not supporting it (i.e. c-sky and riscv).  
> 
> I believe that the only 32-bit architecture without
> __NR_clock_settime64 is x32. 

Ok, I see. 

Please correct me - would it be feasible to just #undef
__ASSYME_TIME64_SYSCALLS for x32 ?

> While newer 32-bit architectures like
> riscv do not have __NR_clock_settime:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4c08b9776b392e20efc6198ebe1bc8ec1911d9b

Then it shall use clock_settime64 from the outset if support added.

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Stepan Golosunov May 2, 2019, 11:54 a.m. UTC | #5
02.05.2019 в 10:51:08 +0200 Lukasz Majewski написал:
> Hi Stepan,
> 
> > 30.04.2019 в 11:05:05 +0200 Lukasz Majewski написал:
> > > IMHO, the abstraction would be:
> > > 
> > > 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit native
> > > systems
> > > 
> > > 2. It is defined by default in:
> > > sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems (and
> > > the actual presence of the syscall is decided upon definitions of
> > > __NR_xxx* (i.e. # ifdef __NR_clock_settime64).  
> > 
> > I think that __NR_clock_settime64 should be used unconditionally when
> > __ASSUME_64BIT_TIME is defined.
> 
> Could you clarify it a bit?

I meant something like this:

int
__clock_settime64 (clockid_t clock_id, const struct __timespec64 *tp)
{
#ifdef __ASSUME_64BIT_TIME
  return INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
#else
…

But I see now that most of the existing code would just miscompile in
cases where __ASSUME_* is defined while corresponding __NR_* is not.

> In the code as proposed in:
> https://patchwork.ozlabs.org/patch/1092583/
> 
> The call to clock_settime64 is protected with # ifdef
> __NR_clock_settime64 - otherwise we do a fallback to (32 bit)
> clock_settime.
> 
> Moreover, the # ifdef __ASSUME_64BIT_TIME provides a fallback path if
> kernel version is older than 5.1.

The fallback would be wrong in cases where __NR_clock_settime is not
defined or is not 32-bit.

> > > As those syscalls are provided on almost every 32 bit system now
> > > (5.1-rc6):
> > > git grep -n "clock_settime64"
> > > 
> > > gives support for: arm, arm64 (compat mode), m68k, microblaze, mips,
> > > parisc, powerpc, s390, sh, sparc, x86, xtensa
> > > 
> > > So it would be reasonable to just add this __ASSUME definition code
> > > to sysdeps/unix/sysv/linux/kernel-features.h and #undef it for
> > > architectures not supporting it (i.e. c-sky and riscv).  
> > 
> > I believe that the only 32-bit architecture without
> > __NR_clock_settime64 is x32. 
> 
> Ok, I see. 
> 
> Please correct me - would it be feasible to just #undef
> __ASSYME_TIME64_SYSCALLS for x32 ?

You'll need to know whether to use __NR_clock_settime64 or
__NR_clock_settime in cases where __TIMESIZE == 64 and
__WORDSIZE == 32.

One way would be by defining __ASSUME_TIME64_SYSCALLS unconditionally
on x32 and then defining __NR_clock_settime64 to __NR_clock_settime
when __ASSUME_TIME64_SYSCALLS is defined while __NR_clock_settime64
isn't.

> > While newer 32-bit architectures like
> > riscv do not have __NR_clock_settime:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4c08b9776b392e20efc6198ebe1bc8ec1911d9b
> 
> Then it shall use clock_settime64 from the outset if support added.

It probably should have __TIMESIZE == 64 though.
  
Lukasz Majewski May 2, 2019, 1:55 p.m. UTC | #6
Hi Stepan,

> 02.05.2019 в 10:51:08 +0200 Lukasz Majewski написал:
> > Hi Stepan,
> >   
> > > 30.04.2019 в 11:05:05 +0200 Lukasz Majewski написал:  
> > > > IMHO, the abstraction would be:
> > > > 
> > > > 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit native
> > > > systems
> > > > 
> > > > 2. It is defined by default in:
> > > > sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems
> > > > (and the actual presence of the syscall is decided upon
> > > > definitions of __NR_xxx* (i.e. # ifdef
> > > > __NR_clock_settime64).    
> > > 
> > > I think that __NR_clock_settime64 should be used unconditionally
> > > when __ASSUME_64BIT_TIME is defined.  
> > 
> > Could you clarify it a bit?  
> 
> I meant something like this:
> 
> int
> __clock_settime64 (clockid_t clock_id, const struct __timespec64 *tp)
> {
> #ifdef __ASSUME_64BIT_TIME
>   return INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
> #else
> …
> 
> But I see now that most of the existing code would just miscompile in
> cases where __ASSUME_* is defined while corresponding __NR_* is not.
> 
> > In the code as proposed in:
> > https://patchwork.ozlabs.org/patch/1092583/
> > 
> > The call to clock_settime64 is protected with # ifdef
> > __NR_clock_settime64 - otherwise we do a fallback to (32 bit)
> > clock_settime.
> > 
> > Moreover, the # ifdef __ASSUME_64BIT_TIME provides a fallback path
> > if kernel version is older than 5.1.  
> 
> The fallback would be wrong in cases where __NR_clock_settime is not
> defined or is not 32-bit.
> 
> > > > As those syscalls are provided on almost every 32 bit system now
> > > > (5.1-rc6):
> > > > git grep -n "clock_settime64"
> > > > 
> > > > gives support for: arm, arm64 (compat mode), m68k, microblaze,
> > > > mips, parisc, powerpc, s390, sh, sparc, x86, xtensa
> > > > 
> > > > So it would be reasonable to just add this __ASSUME definition
> > > > code to sysdeps/unix/sysv/linux/kernel-features.h and #undef it
> > > > for architectures not supporting it (i.e. c-sky and riscv).    
> > > 
> > > I believe that the only 32-bit architecture without
> > > __NR_clock_settime64 is x32.   
> > 
> > Ok, I see. 
> > 
> > Please correct me - would it be feasible to just #undef
> > __ASSYME_TIME64_SYSCALLS for x32 ?  
> 
> You'll need to know whether to use __NR_clock_settime64 or
> __NR_clock_settime in cases where __TIMESIZE == 64 and
> __WORDSIZE == 32.
> 
> One way would be by defining __ASSUME_TIME64_SYSCALLS unconditionally
> on x32 and then defining __NR_clock_settime64 to __NR_clock_settime
> when __ASSUME_TIME64_SYSCALLS is defined while __NR_clock_settime64
> isn't.
> 

I see. Thanks for the hint.

> > > While newer 32-bit architectures like
> > > riscv do not have __NR_clock_settime:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4c08b9776b392e20efc6198ebe1bc8ec1911d9b  
> > 
> > Then it shall use clock_settime64 from the outset if support
> > added.  
> 
> It probably should have __TIMESIZE == 64 though.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Joseph Myers May 2, 2019, 3:04 p.m. UTC | #7
On Tue, 30 Apr 2019, Lukasz Majewski wrote:

>  - The need for explicit clearing padding when calling syscalls (as to
>    be better safe than sorry in the future - there was related
>    discussion started by Stepan).

This really isn't a difficult question.  What it comes down to is whether 
the Linux kernel, in the first release version with these syscalls (we 
don't care about old -rc versions; what matters is the actual 5.1 
release), ignores the padding.

If 5.1 *release* ignores the padding, that is part of the kernel/userspace 
ABI, in accordance with the kernel principle of not breaking userspace.  
Thus, it is something userspace can rely on, now and in the future.

If 5.1 release does not ignore the padding, syscall presence does not mean 
the padding is ignored by the kernel and so glibc needs to clear padding.  
Of course, it needs to clear padding in a *copy* of the value provided by 
the user unless the glibc API in question requires the timespec value in 
question to be in writable memory.

So, which is (or will be) the case in 5.1 release?  Padding ignored or 
not?  If more complicated (ignored for some architectures / ABIs but not 
for others, or depending on whether compat syscalls are in use), then say 
so - give a precise description of the exact circumstances under which the 
padding around a 32-bit tv_nsec will or will not be ignored by the kernel 
on input from userspace.

(x32 is a separate matter, as it already has 64-bit times, and a 
non-POSIX-conforming tv_nsec, so this patch series just needs to avoid 
breaking anything that currently works there.  Any fix for bug 16437 would 
need to involve clearing padding in userspace, unless not only the kernel 
changes to ignore that padding but all kernel versions without such a 
change cease to be supported by glibc for x32.)

> You are right here - the 
> 
> #if __TIMESIZE != 64
> # if __LINUX_KERNEL_VERSION >= 0x050100
> #  define __ASSUME_64BIT_TIME 1
> # endif
> #endif
> 
> would do the trick.

But that wouldn't be right for *future* configurations where the kernel 
"long" is 32-bit but only 64-bit time is supported in the kernel and glibc 
(so __TIMESIZE is 64, and only the new syscalls and not the old ones are 
supported).  That is, the right abstraction here is not really __TIMESIZE.

It's possible it's __SYSCALL_WORDSIZE, except that's only defined for 
x86_64, so would need to be made more generally available if it's to be 
used here.  And if made more generally available, it would need a careful 
comment explaining exactly what it means.

> 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit native systems
> 
> 2. It is defined by default in:
> sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems (and the

It would be best to avoid descriptions such as "64 bit native systems" and 
"32 bit systems" when defining the relevant abstractions, because they are 
simply too ambiguous in this context, where one of the key thing to do is 
to make the semantics obvious in cases that have some attributes of 32-bit 
systems and other attributes of 64-bit systems.

We have configurations such as x32 and MIPS n32 which have 64-bit 
registers but 32-bit "long" and pointers.  Are those 64-bit or 32-bit?  As 
far as glibc's definition of __WORDSIZE is concerned, they are 32-bit; 
__WORDSIZE is the size in bits of long and pointers.  As far as optimized 
code working one word at a time is concerned (libm functions, string 
functions, etc.), they are best considered 64-bit, because of the 64-bit 
registers.  For the present issue, they are *different* from each other: 
x32 does not have the new syscalls (it already had 64-bit times), n32 does 
have the new syscalls (it previously had 32-bit times).

Again, I think the size of __syscall_slong_t is probably what's relevant.  
Note that "size of long for syscalls" (which is 64-bit for x32 but 32-bit 
for n32) is *not* the same thing as "size passed in a single register for 
syscalls" (n32 passes 64-bits values in a single register to syscalls, on 
the principle of keeping the ABI for those similar to that for normal 
function calls; but there have been more recent suggestions in the kernel 
context - I don't know the conclusion from them - of whether future such 
ILP32 ABIs with 64-bit registers should be more similar to the ABIs using 
32-bit registers, to allow compat syscall code to be used for them more 
consistently).

> As those syscalls are provided on almost every 32 bit system now
> (5.1-rc6):
> git grep -n "clock_settime64"
> 
> gives support for: arm, arm64 (compat mode), m68k, microblaze, mips,
> parisc, powerpc, s390, sh, sparc, x86, xtensa
> 
> So it would be reasonable to just add this __ASSUME definition code to
> sysdeps/unix/sysv/linux/kernel-features.h and #undef it for
> architectures not supporting it (i.e. c-sky and riscv).

No, that's not accurate.  Newer architectures (such as csky and riscv) use 
the asm-generic syscall table and so get these syscalls automatically if 
__BITS_PER_LONG == 32.  So it would be wrong to #undef in those cases.

When checking each glibc architecture / ABI combination, to see if the 
syscalls are present in the kernel, you need to allow for some 
architectures using asm-generic (which means that for such architectures 
you only need to check the generic logic, then look at any compat syscall 
tables, such as for arm on arm64).  For architectures not using 
asm-generic you need to check the per-architecture syscalls tables for 
each relevant ABI.
  
Stepan Golosunov May 5, 2019, 2:10 p.m. UTC | #8
02.05.2019 в 15:04:18 +0000 Joseph Myers написал:
> On Tue, 30 Apr 2019, Lukasz Majewski wrote:
> 
> >  - The need for explicit clearing padding when calling syscalls (as to
> >    be better safe than sorry in the future - there was related
> >    discussion started by Stepan).
> 
> This really isn't a difficult question.  What it comes down to is whether 
> the Linux kernel, in the first release version with these syscalls (we 
> don't care about old -rc versions; what matters is the actual 5.1 
> release), ignores the padding.
> 
> If 5.1 *release* ignores the padding, that is part of the kernel/userspace 
> ABI, in accordance with the kernel principle of not breaking userspace.  
> Thus, it is something userspace can rely on, now and in the future.
> 
> If 5.1 release does not ignore the padding, syscall presence does not mean 
> the padding is ignored by the kernel and so glibc needs to clear padding.  
> Of course, it needs to clear padding in a *copy* of the value provided by 
> the user unless the glibc API in question requires the timespec value in 
> question to be in writable memory.
> 
> So, which is (or will be) the case in 5.1 release?  Padding ignored or 
> not?  If more complicated (ignored for some architectures / ABIs but not 
> for others, or depending on whether compat syscalls are in use), then say 
> so - give a precise description of the exact circumstances under which the 
> padding around a 32-bit tv_nsec will or will not be ignored by the kernel 
> on input from userspace.

In current linux git it looks like padding is correctly ignored in
32-bit kernels (because kernel itself has 32-bit tv_nsec there) but
the code to clear it on compat syscalls in 64-bit kernels seems to be
broken.

The patch to fix this is at

https://lore.kernel.org/lkml/20190429131951.471701-1-arnd@arndb.de/

but it doesn't seem like it has reached Linus yet.


(Hmm.  I think that old ipc and socketcall syscalls in 32-bit kernels
are broken without that patch too.  They would try to read
__kernel_timespec when callers are passing old_timespec32.)
  
Lukasz Majewski May 5, 2019, 8:46 p.m. UTC | #9
On Sun, 5 May 2019 18:10:54 +0400
Stepan Golosunov <stepan@golosunov.pp.ru> wrote:

> 02.05.2019 в 15:04:18 +0000 Joseph Myers написал:
> > On Tue, 30 Apr 2019, Lukasz Majewski wrote:
> >   
> > >  - The need for explicit clearing padding when calling syscalls
> > > (as to be better safe than sorry in the future - there was related
> > >    discussion started by Stepan).  
> > 
> > This really isn't a difficult question.  What it comes down to is
> > whether the Linux kernel, in the first release version with these
> > syscalls (we don't care about old -rc versions; what matters is the
> > actual 5.1 release), ignores the padding.
> > 
> > If 5.1 *release* ignores the padding, that is part of the
> > kernel/userspace ABI, in accordance with the kernel principle of
> > not breaking userspace. Thus, it is something userspace can rely
> > on, now and in the future.
> > 
> > If 5.1 release does not ignore the padding, syscall presence does
> > not mean the padding is ignored by the kernel and so glibc needs to
> > clear padding. Of course, it needs to clear padding in a *copy* of
> > the value provided by the user unless the glibc API in question
> > requires the timespec value in question to be in writable memory.
> > 
> > So, which is (or will be) the case in 5.1 release?  Padding ignored
> > or not?  If more complicated (ignored for some architectures / ABIs
> > but not for others, or depending on whether compat syscalls are in
> > use), then say so - give a precise description of the exact
> > circumstances under which the padding around a 32-bit tv_nsec will
> > or will not be ignored by the kernel on input from userspace.  
> 
> In current linux git it looks like padding is correctly ignored in
> 32-bit kernels (because kernel itself has 32-bit tv_nsec there) but
> the code to clear it on compat syscalls in 64-bit kernels seems to be
> broken.
> 
> The patch to fix this is at
> 
> https://lore.kernel.org/lkml/20190429131951.471701-1-arnd@arndb.de/
> 
> but it doesn't seem like it has reached Linus yet.
> 

I hope that this patch will be pulled soon (before final cut) - for that
reason we can assume that the padding is ignored by the kernel and
hence do not explicitly clear it in glibc (as it was done in sent
patches)

> 
> (Hmm.  I think that old ipc and socketcall syscalls in 32-bit kernels
> are broken without that patch too.  They would try to read
> __kernel_timespec when callers are passing old_timespec32.)

Please correct me if I'm wrong, but this problem is related to x32
machines (and not to ARM 32 bit ones with Y2038).


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Lukasz Majewski May 6, 2019, 1:38 p.m. UTC | #10
Hi Stepan,

> Hi Stepan,
> 
> > 02.05.2019 в 10:51:08 +0200 Lukasz Majewski написал:  
> > > Hi Stepan,
> > >     
> > > > 30.04.2019 в 11:05:05 +0200 Lukasz Majewski написал:    
> > > > > IMHO, the abstraction would be:
> > > > > 
> > > > > 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit
> > > > > native systems
> > > > > 
> > > > > 2. It is defined by default in:
> > > > > sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems
> > > > > (and the actual presence of the syscall is decided upon
> > > > > definitions of __NR_xxx* (i.e. # ifdef
> > > > > __NR_clock_settime64).      
> > > > 
> > > > I think that __NR_clock_settime64 should be used unconditionally
> > > > when __ASSUME_64BIT_TIME is defined.    
> > > 
> > > Could you clarify it a bit?    
> > 
> > I meant something like this:
> > 
> > int
> > __clock_settime64 (clockid_t clock_id, const struct __timespec64
> > *tp) {
> > #ifdef __ASSUME_64BIT_TIME
> >   return INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
> > #else
> > …
> > 
> > But I see now that most of the existing code would just miscompile
> > in cases where __ASSUME_* is defined while corresponding __NR_* is
> > not. 
> > > In the code as proposed in:
> > > https://patchwork.ozlabs.org/patch/1092583/
> > > 
> > > The call to clock_settime64 is protected with # ifdef
> > > __NR_clock_settime64 - otherwise we do a fallback to (32 bit)
> > > clock_settime.
> > > 
> > > Moreover, the # ifdef __ASSUME_64BIT_TIME provides a fallback path
> > > if kernel version is older than 5.1.    
> > 
> > The fallback would be wrong in cases where __NR_clock_settime is not
> > defined or is not 32-bit.
> >   
> > > > > As those syscalls are provided on almost every 32 bit system
> > > > > now (5.1-rc6):
> > > > > git grep -n "clock_settime64"
> > > > > 
> > > > > gives support for: arm, arm64 (compat mode), m68k, microblaze,
> > > > > mips, parisc, powerpc, s390, sh, sparc, x86, xtensa
> > > > > 
> > > > > So it would be reasonable to just add this __ASSUME definition
> > > > > code to sysdeps/unix/sysv/linux/kernel-features.h and #undef
> > > > > it for architectures not supporting it (i.e. c-sky and
> > > > > riscv).      
> > > > 
> > > > I believe that the only 32-bit architecture without
> > > > __NR_clock_settime64 is x32.     
> > > 
> > > Ok, I see. 
> > > 
> > > Please correct me - would it be feasible to just #undef
> > > __ASSYME_TIME64_SYSCALLS for x32 ?    
> > 
> > You'll need to know whether to use __NR_clock_settime64 or
> > __NR_clock_settime in cases where __TIMESIZE == 64 and
> > __WORDSIZE == 32.

Please correct me, but I do have some doubts here.

As x32 now uses 64 bit time (and has TIMESIZE==64) - it uses the
clock_settime call (with in-kernel broken tv_nsec padding clearing -
but for this the fix is in its way to upstream).

Why does it need to also support clock_settime64 ? 

> > 
> > One way would be by defining __ASSUME_TIME64_SYSCALLS
> > unconditionally on x32 and then defining __NR_clock_settime64 to
> > __NR_clock_settime when __ASSUME_TIME64_SYSCALLS is defined while
> > __NR_clock_settime64 isn't.
> >   
> 
> I see. Thanks for the hint.
> 
> > > > While newer 32-bit architectures like
> > > > riscv do not have __NR_clock_settime:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4c08b9776b392e20efc6198ebe1bc8ec1911d9b    
> > > 
> > > Then it shall use clock_settime64 from the outset if support
> > > added.    
> > 
> > It probably should have __TIMESIZE == 64 though.  
> 
> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Lukasz Majewski May 6, 2019, 2:26 p.m. UTC | #11
Hi Stepan,

> Hi Stepan,
> 
> > Hi Stepan,
> >   
> > > 02.05.2019 в 10:51:08 +0200 Lukasz Majewski написал:    
> > > > Hi Stepan,
> > > >       
> > > > > 30.04.2019 в 11:05:05 +0200 Lukasz Majewski написал:      
> > > > > > IMHO, the abstraction would be:
> > > > > > 
> > > > > > 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit
> > > > > > native systems
> > > > > > 
> > > > > > 2. It is defined by default in:
> > > > > > sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems
> > > > > > (and the actual presence of the syscall is decided upon
> > > > > > definitions of __NR_xxx* (i.e. # ifdef
> > > > > > __NR_clock_settime64).        
> > > > > 
> > > > > I think that __NR_clock_settime64 should be used
> > > > > unconditionally when __ASSUME_64BIT_TIME is defined.      
> > > > 
> > > > Could you clarify it a bit?      
> > > 
> > > I meant something like this:
> > > 
> > > int
> > > __clock_settime64 (clockid_t clock_id, const struct __timespec64
> > > *tp) {
> > > #ifdef __ASSUME_64BIT_TIME
> > >   return INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
> > > #else
> > > …
> > > 
> > > But I see now that most of the existing code would just miscompile
> > > in cases where __ASSUME_* is defined while corresponding __NR_* is
> > > not.   
> > > > In the code as proposed in:
> > > > https://patchwork.ozlabs.org/patch/1092583/
> > > > 
> > > > The call to clock_settime64 is protected with # ifdef
> > > > __NR_clock_settime64 - otherwise we do a fallback to (32 bit)
> > > > clock_settime.
> > > > 
> > > > Moreover, the # ifdef __ASSUME_64BIT_TIME provides a fallback
> > > > path if kernel version is older than 5.1.      
> > > 
> > > The fallback would be wrong in cases where __NR_clock_settime is
> > > not defined or is not 32-bit.
> > >     
> > > > > > As those syscalls are provided on almost every 32 bit system
> > > > > > now (5.1-rc6):
> > > > > > git grep -n "clock_settime64"
> > > > > > 
> > > > > > gives support for: arm, arm64 (compat mode), m68k,
> > > > > > microblaze, mips, parisc, powerpc, s390, sh, sparc, x86,
> > > > > > xtensa
> > > > > > 
> > > > > > So it would be reasonable to just add this __ASSUME
> > > > > > definition code to
> > > > > > sysdeps/unix/sysv/linux/kernel-features.h and #undef it for
> > > > > > architectures not supporting it (i.e. c-sky and
> > > > > > riscv).        
> > > > > 
> > > > > I believe that the only 32-bit architecture without
> > > > > __NR_clock_settime64 is x32.       
> > > > 
> > > > Ok, I see. 
> > > > 
> > > > Please correct me - would it be feasible to just #undef
> > > > __ASSYME_TIME64_SYSCALLS for x32 ?      
> > > 
> > > You'll need to know whether to use __NR_clock_settime64 or
> > > __NR_clock_settime in cases where __TIMESIZE == 64 and
> > > __WORDSIZE == 32.  
> 
> Please correct me, but I do have some doubts here.
> 
> As x32 now uses 64 bit time (and has TIMESIZE==64) - it uses the
> clock_settime call (with in-kernel broken tv_nsec padding clearing -
> but for this the fix is in its way to upstream).
> 
> Why does it need to also support clock_settime64 ? 

I was too eager to send the mail.

It is connected with your proposition to use __WORDSIZE for #ifdef on
__ASSUME_TIME64_SYSCALLS. 

As x32 has __WORDSIZE=32 (but __TIMESIZE=64), it would fall into
"category" of archs using explicit 64 bit calls (i.e. clock_settime64).

However, for it - those shall be replaced with syscalls used up till now
(i.e. clock_settime).

Am I right here ?

> 
> > > 
> > > One way would be by defining __ASSUME_TIME64_SYSCALLS
> > > unconditionally on x32 and then defining __NR_clock_settime64 to
> > > __NR_clock_settime when __ASSUME_TIME64_SYSCALLS is defined while
> > > __NR_clock_settime64 isn't.
> > >     
> > 
> > I see. Thanks for the hint.
> >   
> > > > > While newer 32-bit architectures like
> > > > > riscv do not have __NR_clock_settime:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4c08b9776b392e20efc6198ebe1bc8ec1911d9b      
> > > > 
> > > > Then it shall use clock_settime64 from the outset if support
> > > > added.      
> > > 
> > > It probably should have __TIMESIZE == 64 though.    
> > 
> > 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > --
> > 
> > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell,
> > Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > lukma@denx.de  
> 
> 
> 
> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Lukasz Majewski May 6, 2019, 2:56 p.m. UTC | #12
Hi Joseph,

> On Tue, 30 Apr 2019, Lukasz Majewski wrote:
> 
> >  - The need for explicit clearing padding when calling syscalls (as
> > to be better safe than sorry in the future - there was related
> >    discussion started by Stepan).  
> 
> This really isn't a difficult question.  What it comes down to is
> whether the Linux kernel, in the first release version with these
> syscalls (we don't care about old -rc versions; what matters is the
> actual 5.1 release), ignores the padding.
> 
> If 5.1 *release* ignores the padding, that is part of the
> kernel/userspace ABI, in accordance with the kernel principle of not
> breaking userspace. Thus, it is something userspace can rely on, now
> and in the future.
> 
> If 5.1 release does not ignore the padding, syscall presence does not
> mean the padding is ignored by the kernel and so glibc needs to clear
> padding. Of course, it needs to clear padding in a *copy* of the
> value provided by the user unless the glibc API in question requires
> the timespec value in question to be in writable memory.
> 
> So, which is (or will be) the case in 5.1 release?  Padding ignored
> or not?

As confirmed in the other mail - the padding is ignored in Linux kernel
(and the fix patch for x32 is up its way to be pulled).

>  If more complicated (ignored for some architectures / ABIs
> but not for others, or depending on whether compat syscalls are in
> use), then say so - give a precise description of the exact
> circumstances under which the padding around a 32-bit tv_nsec will or
> will not be ignored by the kernel on input from userspace.
> 
> (x32 is a separate matter, as it already has 64-bit times, and a 
> non-POSIX-conforming tv_nsec, so this patch series just needs to
> avoid breaking anything that currently works there.  Any fix for bug
> 16437 would need to involve clearing padding in userspace, unless not
> only the kernel changes to ignore that padding but all kernel
> versions without such a change cease to be supported by glibc for
> x32.)
> 
> > You are right here - the 
> > 
> > #if __TIMESIZE != 64
> > # if __LINUX_KERNEL_VERSION >= 0x050100
> > #  define __ASSUME_64BIT_TIME 1
> > # endif
> > #endif
> > 
> > would do the trick.  
> 
> But that wouldn't be right for *future* configurations where the
> kernel "long" is 32-bit but only 64-bit time is supported in the
> kernel and glibc (so __TIMESIZE is 64, and only the new syscalls and
> not the old ones are supported).  That is, the right abstraction here
> is not really __TIMESIZE.
> 
> It's possible it's __SYSCALL_WORDSIZE, except that's only defined for 
> x86_64, so would need to be made more generally available if it's to
> be used here.  And if made more generally available, it would need a
> careful comment explaining exactly what it means.

Cannot we just use __WORDSIZE != 64  as proposed by Stepan?
(for x32 we would have it defined by default)

#if __WORDSIZE != 64
# if __LINUX_KERNEL_VERSION >= 0x050100
#  define __ASSUME_TIME64_SYSCALLS 1
# endif
#endif  

Such approach would allow us to avoid introducing new abstraction
(__SYSCALL_WORDSIZE).

As of now only x32 has __WORDSIZE=32 and __TIMESIZE=64 and would be
treated as a special case with solution proposed by Stepan in the
other mail:

---->8-----
One way would be by defining __ASSUME_TIME64_SYSCALLS unconditionally
on x32 and then defining __NR_clock_settime64 to __NR_clock_settime
when __ASSUME_TIME64_SYSCALLS is defined while __NR_clock_settime64
isn't.
----8<-----

Or even simpler:
#undef __ASSUME_TIME64_SYSCALLS for x32 (with proper comment)

x32 is special here - if (unlikely) some other arch with __WORDSIZE=32
and __TIMESIZE=64 emerge - it shall follow the same pattern




For __WORDSIZE/__TIMESIZE=32 and __WORDSIZE/__TIMESIZE=64 archs we
would have a clear situation.


> 
> > 1. The __ASSUME_64BIT_TIME is _never_ defined for 64 bit native
> > systems
> > 
> > 2. It is defined by default in:
> > sysdeps/unix/sysv/linux/kernel-features.h for 32 bit systems (and
> > the  
> 
> It would be best to avoid descriptions such as "64 bit native
> systems" and "32 bit systems" when defining the relevant
> abstractions, because they are simply too ambiguous in this context,
> where one of the key thing to do is to make the semantics obvious in
> cases that have some attributes of 32-bit systems and other
> attributes of 64-bit systems.
> 
> We have configurations such as x32 and MIPS n32 which have 64-bit 
> registers but 32-bit "long" and pointers.  Are those 64-bit or
> 32-bit?  As far as glibc's definition of __WORDSIZE is concerned,
> they are 32-bit; __WORDSIZE is the size in bits of long and
> pointers.  As far as optimized code working one word at a time is
> concerned (libm functions, string functions, etc.), they are best
> considered 64-bit, because of the 64-bit registers.  For the present
> issue, they are *different* from each other: x32 does not have the
> new syscalls (it already had 64-bit times), n32 does have the new
> syscalls (it previously had 32-bit times).
> 
> Again, I think the size of __syscall_slong_t is probably what's
> relevant. Note that "size of long for syscalls" (which is 64-bit for
> x32 but 32-bit for n32) is *not* the same thing as "size passed in a
> single register for syscalls" (n32 passes 64-bits values in a single
> register to syscalls, on the principle of keeping the ABI for those
> similar to that for normal function calls; but there have been more
> recent suggestions in the kernel context - I don't know the
> conclusion from them - of whether future such ILP32 ABIs with 64-bit
> registers should be more similar to the ABIs using 32-bit registers,
> to allow compat syscall code to be used for them more consistently).
> 
> > As those syscalls are provided on almost every 32 bit system now
> > (5.1-rc6):
> > git grep -n "clock_settime64"
> > 
> > gives support for: arm, arm64 (compat mode), m68k, microblaze, mips,
> > parisc, powerpc, s390, sh, sparc, x86, xtensa
> > 
> > So it would be reasonable to just add this __ASSUME definition code
> > to sysdeps/unix/sysv/linux/kernel-features.h and #undef it for
> > architectures not supporting it (i.e. c-sky and riscv).  
> 
> No, that's not accurate.  Newer architectures (such as csky and
> riscv) use the asm-generic syscall table and so get these syscalls
> automatically if __BITS_PER_LONG == 32.  So it would be wrong to
> #undef in those cases.
> 
> When checking each glibc architecture / ABI combination, to see if
> the syscalls are present in the kernel, you need to allow for some 
> architectures using asm-generic (which means that for such
> architectures you only need to check the generic logic, then look at
> any compat syscall tables, such as for arm on arm64).  For
> architectures not using asm-generic you need to check the
> per-architecture syscalls tables for each relevant ABI.
> 




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Stepan Golosunov May 6, 2019, 7:36 p.m. UTC | #13
06.05.2019 в 16:26:59 +0200 Lukasz Majewski написал:

> > > > > > I believe that the only 32-bit architecture without
> > > > > > __NR_clock_settime64 is x32.       
> > > > > 
> > > > > Ok, I see. 
> > > > > 
> > > > > Please correct me - would it be feasible to just #undef
> > > > > __ASSYME_TIME64_SYSCALLS for x32 ?      
> > > > 
> > > > You'll need to know whether to use __NR_clock_settime64 or
> > > > __NR_clock_settime in cases where __TIMESIZE == 64 and
> > > > __WORDSIZE == 32.  
> > 
> > Please correct me, but I do have some doubts here.
> > 
> > As x32 now uses 64 bit time (and has TIMESIZE==64) - it uses the
> > clock_settime call (with in-kernel broken tv_nsec padding clearing -
> > but for this the fix is in its way to upstream).
> > 
> > Why does it need to also support clock_settime64 ? 
> 
> I was too eager to send the mail.
> 
> It is connected with your proposition to use __WORDSIZE for #ifdef on
> __ASSUME_TIME64_SYSCALLS. 
> 
> As x32 has __WORDSIZE=32 (but __TIMESIZE=64), it would fall into
> "category" of archs using explicit 64 bit calls (i.e. clock_settime64).
> 
> However, for it - those shall be replaced with syscalls used up till now
> (i.e. clock_settime).
> 
> Am I right here ?

x32 has __WORDSIZE==32 and __TIMESIZE==64.  Future 32-bit
architectures should have __WORDSIZE==32 and __TIMESIZE==64 too.  And
the only difference in implementation of clock_settime64 function there
should be name of syscall (ignoring possible padding clearing
issues).

> > > > One way would be by defining __ASSUME_TIME64_SYSCALLS
> > > > unconditionally on x32 and then defining __NR_clock_settime64 to
> > > > __NR_clock_settime when __ASSUME_TIME64_SYSCALLS is defined while
> > > > __NR_clock_settime64 isn't.

I think now that with this scheme __ASSUME_TIME64_SYSCALLS should be
defined unconditionally for the __WORDSIZE==64 case too.  With this
there will be no need to use __TIMESIZE inside clock_settime64
function.  __TIMESIZE describes interfaces provided by glibc, and has
no relation with kernel interfaces used by glibc.

#ifdef __ASSUME_TIME64_SYSCALLS
  call __NR_clock_settime64 (possibly defined to __NR_clock_settime)
#else
  call __NR_clock_settime64 with fallback to __NR_clock_settime on ENOSYS
#endif
  
Lukasz Majewski May 6, 2019, 9:14 p.m. UTC | #14
Hi Stepan,

> 06.05.2019 в 16:26:59 +0200 Lukasz Majewski написал:
> 
> > > > > > > I believe that the only 32-bit architecture without
> > > > > > > __NR_clock_settime64 is x32.         
> > > > > > 
> > > > > > Ok, I see. 
> > > > > > 
> > > > > > Please correct me - would it be feasible to just #undef
> > > > > > __ASSYME_TIME64_SYSCALLS for x32 ?        
> > > > > 
> > > > > You'll need to know whether to use __NR_clock_settime64 or
> > > > > __NR_clock_settime in cases where __TIMESIZE == 64 and
> > > > > __WORDSIZE == 32.    
> > > 
> > > Please correct me, but I do have some doubts here.
> > > 
> > > As x32 now uses 64 bit time (and has TIMESIZE==64) - it uses the
> > > clock_settime call (with in-kernel broken tv_nsec padding
> > > clearing - but for this the fix is in its way to upstream).
> > > 
> > > Why does it need to also support clock_settime64 ?   
> > 
> > I was too eager to send the mail.
> > 
> > It is connected with your proposition to use __WORDSIZE for #ifdef
> > on __ASSUME_TIME64_SYSCALLS. 
> > 
> > As x32 has __WORDSIZE=32 (but __TIMESIZE=64), it would fall into
> > "category" of archs using explicit 64 bit calls (i.e.
> > clock_settime64).
> > 
> > However, for it - those shall be replaced with syscalls used up
> > till now (i.e. clock_settime).
> > 
> > Am I right here ?  
> 
> x32 has __WORDSIZE==32 and __TIMESIZE==64.  Future 32-bit
> architectures should have __WORDSIZE==32 and __TIMESIZE==64 too.  And
> the only difference in implementation of clock_settime64 function
> there should be name of syscall (ignoring possible padding clearing
> issues).

For ARM, x86 there shall be call to clock_settime64 syscall

For x32 there shall be call to clock_settime syscall (which is
supporting 64 bit anyway - despite the ignoring possible padding
clearing issue).

> 
> > > > > One way would be by defining __ASSUME_TIME64_SYSCALLS
> > > > > unconditionally on x32 and then defining __NR_clock_settime64
> > > > > to __NR_clock_settime when __ASSUME_TIME64_SYSCALLS is
> > > > > defined while __NR_clock_settime64 isn't.  
> 
> I think now that with this scheme __ASSUME_TIME64_SYSCALLS should be
> defined unconditionally for the __WORDSIZE==64 case too.  With this
> there will be no need to use __TIMESIZE inside clock_settime64
> function.  __TIMESIZE describes interfaces provided by glibc, and has
> no relation with kernel interfaces used by glibc.
> 
> #ifdef __ASSUME_TIME64_SYSCALLS
>   call __NR_clock_settime64 (possibly defined to __NR_clock_settime)
> #else
>   call __NR_clock_settime64 with fallback to __NR_clock_settime on
> ENOSYS #endif

I rather thought about something like:

__clock_settime64 (clockid_t clock_id, const struct __timespec64 *tp)
{
  <overflow check>

#ifdef __ASSUME_TIME64_SYSCALLS
# ifdef __NR_clock_settime64
  int ret = INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
  if (ret == 0 || errno != ENOSYS)
	return ret;
# endif
  /* Fall back to syscall supporting 32bit struct timespec.  */
  struct timespec ts32;
  valid_timespec64_to_timespec (tp, &ts32);
  return INLINE_SYSCALL_CALL (clock_settime, clock_id, &ts32);
#else
  /* 64 bit machines + x32 */
  return INLINE_SYSCALL_CALL (clock_settime, clock_id, tp);
#endif
}

In the above pseudo code we assume that __ASSUME_TIME64_SYSCALLS is
#undef'ed for x32 (so it is treated as a 'special case' - in the same
way as x86_64).



Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Joseph Myers May 7, 2019, 3:35 p.m. UTC | #15
On Mon, 6 May 2019, Lukasz Majewski wrote:

> > So, which is (or will be) the case in 5.1 release?  Padding ignored
> > or not?
> 
> As confirmed in the other mail - the padding is ignored in Linux kernel
> (and the fix patch for x32 is up its way to be pulled).

Did the patch to ignore padding (for compat syscalls under 64-bit kernels, 
non-x32) make it into the final 5.1 release?

> > It's possible it's __SYSCALL_WORDSIZE, except that's only defined for 
> > x86_64, so would need to be made more generally available if it's to
> > be used here.  And if made more generally available, it would need a
> > careful comment explaining exactly what it means.
> 
> Cannot we just use __WORDSIZE != 64  as proposed by Stepan?
> (for x32 we would have it defined by default)
> 
> #if __WORDSIZE != 64
> # if __LINUX_KERNEL_VERSION >= 0x050100
> #  define __ASSUME_TIME64_SYSCALLS 1
> # endif
> #endif  
> 
> Such approach would allow us to avoid introducing new abstraction
> (__SYSCALL_WORDSIZE).

There are lots of implementation possibilities, including using 
__WORDSIZE and undefining for x32, or using __WORDSIZE and defining the 
__NR_* macros locally for x32.

At the present stage of review, I'm not really concerned with such 
implementation details.  Rather, I'm reviewing things at the level of 
concepts and abstractions, and how we document those concepts and 
abstractions.  The things we need to define here are:

* What are the sets of syscalls related to 64-bit time, with the property 
that, in any given kernel, either all the syscalls in such a set are 
present, or all the syscalls in such a set are absent?

Right now, maybe there's only one such set, and the subset of it being 
used in glibc can be defined by listing the syscalls.  In future, there 
may well be more than one such set - for example, if syscalls that 
currently only have versions using timeval get new versions using timespec 
that are added even on architectures where the pre-5.1 kernel syscall 
interface already uses 64-bit time.

* For each set of syscalls, what is the *best* logical description of the 
set of kernel configurations that has those syscalls?  There may be many 
equivalent descriptions of the current set of such configurations; we need 
to find one that is correct, unambiguous, succinct, close to the actual 
logic in the kernel that determines whether those syscalls are present, 
and has the least chance of being inaccurate for future kernel 
configurations.

Once we have those concepts clearly defined, we can then look at how to 
translate them into code.

Since (a) future architectures all use asm-generic and (b) the asm-generic 
code uses __BITS_PER_LONG == 32, that suggests something in terms of the 
size of "long" is the right way to describe the condition for the presence 
of the syscalls - at which point you then need to discuss how the size of 
"long" can vary between ABIs, and how in the x32 case the size of "long" 
in the syscall interface is not the same as the normal size of that C 
type.

Within glibc, __WORDSIZE corresponds to the size of "long", so that in 
turn suggests an implementation approach based on __WORDSIZE in some way, 
if you have an exception for x32.
  
Joseph Myers May 7, 2019, 3:43 p.m. UTC | #16
On Mon, 6 May 2019, Stepan Golosunov wrote:

> > > > > One way would be by defining __ASSUME_TIME64_SYSCALLS
> > > > > unconditionally on x32 and then defining __NR_clock_settime64 to
> > > > > __NR_clock_settime when __ASSUME_TIME64_SYSCALLS is defined while
> > > > > __NR_clock_settime64 isn't.
> 
> I think now that with this scheme __ASSUME_TIME64_SYSCALLS should be
> defined unconditionally for the __WORDSIZE==64 case too.  With this

That's certainly one possibility for how to implement things - define all 
the __NR_*64 to the older __NR_* in the case where the older __NR_* 
already handle 64-bit time.

If that approach is used, the comment on __ASSUME_TIME64_SYSCALLS then 
needs to describe *two* different concepts very carefully, without 
conflating them: the set of syscall ABIs with the new syscalls, and the 
set of syscall ABIs which have 64-bit time functionality for a given set 
of interfaces, with the __NR_*64 names for those interfaces being 
available within glibc (but where it might be either old or new syscalls 
behind those names in glibc).
  
Stepan Golosunov May 7, 2019, 8:03 p.m. UTC | #17
06.05.2019 в 23:14:44 +0200 Lukasz Majewski написал:
> For ARM, x86 there shall be call to clock_settime64 syscall
> 
> For x32 there shall be call to clock_settime syscall (which is
> supporting 64 bit anyway - despite the ignoring possible padding
> clearing issue).

> I rather thought about something like:
> 
> __clock_settime64 (clockid_t clock_id, const struct __timespec64 *tp)
> {
>   <overflow check>
> 
> #ifdef __ASSUME_TIME64_SYSCALLS
> # ifdef __NR_clock_settime64
>   int ret = INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
>   if (ret == 0 || errno != ENOSYS)
> 	return ret;
> # endif
>   /* Fall back to syscall supporting 32bit struct timespec.  */
>   struct timespec ts32;
>   valid_timespec64_to_timespec (tp, &ts32);
>   return INLINE_SYSCALL_CALL (clock_settime, clock_id, &ts32);
> #else
>   /* 64 bit machines + x32 */
>   return INLINE_SYSCALL_CALL (clock_settime, clock_id, tp);
> #endif
> }
> 
> In the above pseudo code we assume that __ASSUME_TIME64_SYSCALLS is
> #undef'ed for x32 (so it is treated as a 'special case' - in the same
> way as x86_64).

Usually the point of __ASSUME_* macros is to avoid compiling old
compatibility code when it's not needed.  That means there should be
no fallback to 32-bit clock_settime when __ASSUME_TIME64_SYSCALLS is
defined.  Even when it's defined on 32-bit arm or x86.

And with your current use of __ASSUME_TIME64_SYSCALLS
__NR_clock_settime64 won't be used on any existing architecture.
Because __ASSUME_TIME64_SYSCALLS won't be defined until Linux 5.0 is
no longer supported.

If you replace

#ifdef __ASSUME_TIME64_SYSCALLS

with

#ifdef __NR_clock_settime64

result would make more sense than what you have now.
  
Joseph Myers May 7, 2019, 8:44 p.m. UTC | #18
On Wed, 8 May 2019, Stepan Golosunov wrote:

> If you replace
> 
> #ifdef __ASSUME_TIME64_SYSCALLS
> 
> with
> 
> #ifdef __NR_clock_settime64
> 
> result would make more sense than what you have now.

Actually I'd expect it to start something more like

#ifdef __ASSUME_TIME64_SYSCALLS
  return INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
#elif defined __NR_clock_settime64

to behave as expected for both __ASSUME_TIME64_SYSCALLS, and the case 
where __NR_clock_settime64 is defined but __ASSUME_TIME64_SYSCALLS is not.  
(The correct fallback when __NR_clock_settime64 is not defined depends on 
the size of time_t as used in the old syscalls.  Note that you *cannot* 
assume that 5.1 kernel headers are in use, so __NR_clock_settime64 
undefined might means a system where the syscall ABI already uses 64-bit 
times with clock_settime, or it might mean one where the syscall ABI uses 
32-bit times with clock_settime but old kernel headers are in use.)
  
Lukasz Majewski May 8, 2019, 7:51 a.m. UTC | #19
Hi Stepan,

> 06.05.2019 в 23:14:44 +0200 Lukasz Majewski написал:
> > For ARM, x86 there shall be call to clock_settime64 syscall
> > 
> > For x32 there shall be call to clock_settime syscall (which is
> > supporting 64 bit anyway - despite the ignoring possible padding
> > clearing issue).  
> 
> > I rather thought about something like:
> > 
> > __clock_settime64 (clockid_t clock_id, const struct __timespec64
> > *tp) {
> >   <overflow check>
> > 
> > #ifdef __ASSUME_TIME64_SYSCALLS
> > # ifdef __NR_clock_settime64
> >   int ret = INLINE_SYSCALL_CALL (clock_settime64, clock_id, tp);
> >   if (ret == 0 || errno != ENOSYS)
> > 	return ret;
> > # endif
> >   /* Fall back to syscall supporting 32bit struct timespec.  */
> >   struct timespec ts32;
> >   valid_timespec64_to_timespec (tp, &ts32);
> >   return INLINE_SYSCALL_CALL (clock_settime, clock_id, &ts32);
> > #else
> >   /* 64 bit machines + x32 */
> >   return INLINE_SYSCALL_CALL (clock_settime, clock_id, tp);
> > #endif
> > }
> > 
> > In the above pseudo code we assume that __ASSUME_TIME64_SYSCALLS is
> > #undef'ed for x32 (so it is treated as a 'special case' - in the
> > same way as x86_64).  
> 
> Usually the point of __ASSUME_* macros is to avoid compiling old
> compatibility code when it's not needed.  That means there should be
> no fallback to 32-bit clock_settime when __ASSUME_TIME64_SYSCALLS is
> defined.  Even when it's defined on 32-bit arm or x86.
> 
> And with your current use of __ASSUME_TIME64_SYSCALLS
> __NR_clock_settime64 won't be used on any existing architecture.
> Because __ASSUME_TIME64_SYSCALLS won't be defined until Linux 5.0 is
> no longer supported.

It will be used when the minimal supported kernel reaches 5.1. (now it
is 3.2.)

For testing I do use --enable-kernel="5.1.0" passed to configure.

> 
> If you replace
> 
> #ifdef __ASSUME_TIME64_SYSCALLS
> 
> with
> 
> #ifdef __NR_clock_settime64
> 
> result would make more sense than what you have now.




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Lukasz Majewski May 8, 2019, 10:18 a.m. UTC | #20
Hi Joseph,

> On Mon, 6 May 2019, Lukasz Majewski wrote:
> 
> > > So, which is (or will be) the case in 5.1 release?  Padding
> > > ignored or not?  
> > 
> > As confirmed in the other mail - the padding is ignored in Linux
> > kernel (and the fix patch for x32 is up its way to be pulled).  
> 
> Did the patch to ignore padding (for compat syscalls under 64-bit
> kernels, non-x32) make it into the final 5.1 release?

As fair as I can tell, it was not pulled to 5.1.

> 
> > > It's possible it's __SYSCALL_WORDSIZE, except that's only defined
> > > for x86_64, so would need to be made more generally available if
> > > it's to be used here.  And if made more generally available, it
> > > would need a careful comment explaining exactly what it means.  
> > 
> > Cannot we just use __WORDSIZE != 64  as proposed by Stepan?
> > (for x32 we would have it defined by default)
> > 
> > #if __WORDSIZE != 64
> > # if __LINUX_KERNEL_VERSION >= 0x050100
> > #  define __ASSUME_TIME64_SYSCALLS 1
> > # endif
> > #endif  
> > 
> > Such approach would allow us to avoid introducing new abstraction
> > (__SYSCALL_WORDSIZE).  
> 
> There are lots of implementation possibilities, including using 
> __WORDSIZE and undefining for x32, or using __WORDSIZE and defining
> the __NR_* macros locally for x32.
> 
> At the present stage of review, I'm not really concerned with such 
> implementation details.  Rather, I'm reviewing things at the level of 
> concepts and abstractions, and how we document those concepts and 
> abstractions.  The things we need to define here are:
> 
> * What are the sets of syscalls related to 64-bit time, with the
> property that, in any given kernel, either all the syscalls in such a
> set are present, or all the syscalls in such a set are absent?

The 64 bit versions of syscalls (like clock_settime64 or
clock_gettime64) are available since 5.1 kernel (as you posted already
the following patch: "Update syscall-names.list for Linux 5.1" .

I've now only focused on clock_settime(64) to make the discussion more
concrete.

> 
> Right now, maybe there's only one such set, and the subset of it
> being used in glibc can be defined by listing the syscalls.  In
> future, there may well be more than one such set - for example, if
> syscalls that currently only have versions using timeval get new
> versions using timespec that are added even on architectures where
> the pre-5.1 kernel syscall interface already uses 64-bit time.

This may also happen, yes.

> 
> * For each set of syscalls, what is the *best* logical description of
> the set of kernel configurations that has those syscalls? 

IMHO, the description is as follows:

"Time related syscalls with explicit 64 bit time support to be run on
archs with 32 bit pointer/long (__WORDSIZE==32)"

Hence the __ASSUME_TIME64_SYSCALLS seems to be a descriptive name (as
even in the kernel some syscalls end with _time64 suffix - i.e.
sched_rr_get_interval_time64) .

> There may
> be many equivalent descriptions of the current set of such
> configurations; we need to find one that is correct, unambiguous,
> succinct, close to the actual logic in the kernel that determines
> whether those syscalls are present, and has the least chance of being
> inaccurate for future kernel configurations.
> 

As stated above - those syscalls are supposed to be used on systems
with __WORDSIZE==32 [*] to provide support for 64 bit time (Y2038 safe).

> Once we have those concepts clearly defined, we can then look at how
> to translate them into code.
> 
> Since (a) future architectures all use asm-generic and (b) the
> asm-generic code uses __BITS_PER_LONG == 32, that suggests something
> in terms of the size of "long" is the right way to describe the
> condition for the presence of the syscalls - at which point you then
> need to discuss how the size of "long" can vary between ABIs, and how
> in the x32 case the size of "long" in the syscall interface is not
> the same as the normal size of that C type.
> 
> Within glibc, __WORDSIZE corresponds to the size of "long", so that
> in turn suggests an implementation approach based on __WORDSIZE in
> some way,

That was the approach took in v3 of the patch set [1] (I do send patches
to show the big picture about the idea and most of all - IMHO it is
easier to discuss the concept with some code available)

>  if you have an exception for x32.
> 

Yes, x32 is special :-) (I'm wondering though what is the percentage of
glibc users, who use this particular ABI...)

Note:

[*] - x32 is a special case and (as proposed in the v3 patch set) shall
#undef the __ASSUME_TIME64_SYSCALLS as it uses syscalls from x86_64

[1] - https://patchwork.ozlabs.org/patch/1096343/

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Stepan Golosunov May 8, 2019, 7:48 p.m. UTC | #21
08.05.2019 в 09:51:23 +0200 Lukasz Majewski написал:
> > Usually the point of __ASSUME_* macros is to avoid compiling old
> > compatibility code when it's not needed.  That means there should be
> > no fallback to 32-bit clock_settime when __ASSUME_TIME64_SYSCALLS is
> > defined.  Even when it's defined on 32-bit arm or x86.
> > 
> > And with your current use of __ASSUME_TIME64_SYSCALLS
> > __NR_clock_settime64 won't be used on any existing architecture.
> > Because __ASSUME_TIME64_SYSCALLS won't be defined until Linux 5.0 is
> > no longer supported.
> 
> It will be used when the minimal supported kernel reaches 5.1. (now it
> is 3.2.)
> 
> For testing I do use --enable-kernel="5.1.0" passed to configure.

But __NR_clock_settime64 should be used even if minimal supported
kernel is 3.2.  If it was available in kernel headers and did not
return ENOSYS.

Otherwise glibc released, say, in 2025 still wouldn't be using
__NR_clock_settime64 on 32-bit x86 even when running on the latest
kernel.  Because minimal supported kernel version would be something
like 4.19 then.

Minimal supported kernel version, version of kernel headers used when
compiling glibc and version of kernel glibc is running on—are 3
different kernel versions normally.

What changes when minimal supported kernel becomes 5.1 is the fact
that __NR_clock_settime64 can no longer fail with ENOSYS and thus the
need for fallback to 32-bit __NR_clock_settime disappears.


You need to handle the following 4 cases:

1.  32-bit syscall should not be called (and may not even exist):
1.1.  __NR_clock_settime is 64 bit.
1.2.  __NR_clock_settime64 works in all supported kernels.

2.  __NR_clock_settime is 32-bit and __NR_clock_settime64 may be
unavailable.
2.1.  __NR_clock_settime64 is available in kernel headers, but may
fail with ENOSYS when called.  32-bit __NR_clock_settime should be
called if __NR_clock_settime64 is failing.
2.2.  __NR_clock_settime64 is not present in kernel headers.
32-bit __NR_clock_settime should be called.
  
Joseph Myers May 9, 2019, 3:46 p.m. UTC | #22
On Wed, 8 May 2019, Lukasz Majewski wrote:

> The 64 bit versions of syscalls (like clock_settime64 or
> clock_gettime64) are available since 5.1 kernel (as you posted already
> the following patch: "Update syscall-names.list for Linux 5.1" .
> 
> I've now only focused on clock_settime(64) to make the discussion more
> concrete.

The following will be relevant for use of clock_gettime64, but is not 
immediately relevant for clock_settime64:

clock_gettime64 will complicate things because the present clock_gettime 
code uses the vDSO on some architectures.  Is clock_gettime64 available in 
the vDSO in 5.1 on all architectures where clock_gettime is?  If it is, 
with the same symbol version as used for existing vDSO symbols, or a 
different symbol version?  If not in the vDSO, are there any performance 
implications from using a clock_gettime64 syscall in place of a 
clock_gettime call to the vDSO?

(I think the code using the vDSO will automatically fall back to a 
corresponding syscall if the vDSO symbol isn't there, but answers to those 
questions will still be relevant for reviewing any patch for 
clock_gettime64 and understanding exactly what code paths it will use.)
  
Joseph Myers May 9, 2019, 4:12 p.m. UTC | #23
On Wed, 8 May 2019, Stepan Golosunov wrote:

> You need to handle the following 4 cases:

And, to be clear:

These cases are ones to think about when writing a patch in this area.

They are not cases that should be enumerated in comments in the code for 
any particular function, because they are completely standard for how 
these syscalls are handled across all such functions and comments in a 
function should be about things specific to that particular function - we 
don't want repetitive comments that all describe the same generic things.  
Most of this is also generic to *any* case in glibc where we have optional 
support for using a new syscall, with fallback code when the syscall can't 
be assumed to be supported.

If those general rules about optional support for new syscalls are to be 
documented anywhere in glibc, I think maint.texi would be the right place 
for such documentation.

If some function has a good reason for *not* following the generic pattern 
you described, *that* would be a case for having a comment in the function 
explaining why it's not implemented in the normal way.
  
Lukasz Majewski May 15, 2019, 3:39 p.m. UTC | #24
Hi Joseph,

> On Wed, 8 May 2019, Lukasz Majewski wrote:
> 
> > The 64 bit versions of syscalls (like clock_settime64 or
> > clock_gettime64) are available since 5.1 kernel (as you posted
> > already the following patch: "Update syscall-names.list for Linux
> > 5.1" .
> > 
> > I've now only focused on clock_settime(64) to make the discussion
> > more concrete.  
> 
> The following will be relevant for use of clock_gettime64, but is not 
> immediately relevant for clock_settime64:
> 
> clock_gettime64 will complicate things because the present
> clock_gettime code uses the vDSO on some architectures.

Yes, correct. There are vdso's(__vdso_clock_gettime and
__vdso_gettimeofday) for x64, arm, sparc to name a few.

>  Is
> clock_gettime64 available in the vDSO in 5.1 on all architectures
> where clock_gettime is? 

As fair as I can tell clock_gettime64 is not available as vdso in v5.1
Linux kernel. It is only exported to be used as a syscall.

The other issue with current vdso code (as in [1] or [2]) is that the
struct timespec's tv_sec is 'long', which would be 32 bit on 32 bit
systems (like arm).

> If it is, with the same symbol version as
> used for existing vDSO symbols, or a different symbol version?  If
> not in the vDSO, are there any performance implications from using a
> clock_gettime64 syscall in place of a clock_gettime call to the vDSO?
> 

This would need to be checked how severe is the performance regression
when one uses clock_gettime64 instead of dedicated __vdso_clock_gettime
(aliased to clock_gettime()).

It is also up to the Linux kernel community to decide if it is
acceptable to introduce vclock_gettime64.c file, which would provide
vdso for clock_gettime64.

Or maybe just convert [1], [2] to use struct timespec64 instead?

> (I think the code using the vDSO will automatically fall back to a 
> corresponding syscall if the vDSO symbol isn't there, but answers to
> those questions will still be relevant for reviewing any patch for 
> clock_gettime64 and understanding exactly what code paths it will
> use.)

Yes, correct as in [1], [2].

However, I do believe that on the beginning glibc shall support only
the syscall version of clock_gettime64 and make the switch for vdso
only when it is available from the Kernel.

> 


Note:

[1] - arch/x86/entry/vdso/vclock_gettime.c (5.1)
[2] - arch/arm/vdso/vgettimeofday.c

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Arnd Bergmann May 16, 2019, 3:57 a.m. UTC | #25
On Wed, May 15, 2019 at 5:39 PM Lukasz Majewski <lukma@denx.de> wrote:
> > On Wed, 8 May 2019, Lukasz Majewski wrote:
> > If it is, with the same symbol version as
> > used for existing vDSO symbols, or a different symbol version?  If
> > not in the vDSO, are there any performance implications from using a
> > clock_gettime64 syscall in place of a clock_gettime call to the vDSO?
> >
>
> This would need to be checked how severe is the performance regression
> when one uses clock_gettime64 instead of dedicated __vdso_clock_gettime
> (aliased to clock_gettime()).
>
> It is also up to the Linux kernel community to decide if it is
> acceptable to introduce vclock_gettime64.c file, which would provide
> vdso for clock_gettime64.

The vdso code is undergoing a rewrite to use the same implementation
across all architectures, once that is complete, the added clock_gettime64
vdso support is added trivially.

We decided not to add the calls earlier since that would clash with the
rewrite, and not be easily testable across architectures.

> > (I think the code using the vDSO will automatically fall back to a
> > corresponding syscall if the vDSO symbol isn't there, but answers to
> > those questions will still be relevant for reviewing any patch for
> > clock_gettime64 and understanding exactly what code paths it will
> > use.)
>
> Yes, correct as in [1], [2].
>
> However, I do believe that on the beginning glibc shall support only
> the syscall version of clock_gettime64 and make the switch for vdso
> only when it is available from the Kernel.

I think you need to do that anyway, since not all 32-bit architectures
support a vdso at all. As I understood, all vdso support is optional,
so there is always a runtime fallback to deal with kernels lack
the vdso for some reason.

       Arnd
  
Arnd Bergmann May 16, 2019, 4 a.m. UTC | #26
On Wed, May 8, 2019 at 9:48 PM Stepan Golosunov <stepan@golosunov.pp.ru> wrote:
> 08.05.2019 в 09:51:23 +0200 Lukasz Majewski написал:

> You need to handle the following 4 cases:
>
> 1.  32-bit syscall should not be called (and may not even exist):
> 1.1.  __NR_clock_settime is 64 bit.
> 1.2.  __NR_clock_settime64 works in all supported kernels.
>
> 2.  __NR_clock_settime is 32-bit and __NR_clock_settime64 may be
> unavailable.
> 2.1.  __NR_clock_settime64 is available in kernel headers, but may
> fail with ENOSYS when called.  32-bit __NR_clock_settime should be
> called if __NR_clock_settime64 is failing.
> 2.2.  __NR_clock_settime64 is not present in kernel headers.
> 32-bit __NR_clock_settime should be called.

Is there still a chance to revisit that last case? I had assumed that
glibc would have to require new kernel headers to support 64-bit
time_t since some of the macro definitions (ioctl commands,
socket options, ...) in the kernel have changed to deal with both
cases.

Building an application with 64-bit time_t using older headers is
likely to cause random problems.

       Arnd
  
Lukasz Majewski May 16, 2019, 6:58 a.m. UTC | #27
Hi Arnd,

> On Wed, May 15, 2019 at 5:39 PM Lukasz Majewski <lukma@denx.de> wrote:
> > > On Wed, 8 May 2019, Lukasz Majewski wrote:
> > > If it is, with the same symbol version as
> > > used for existing vDSO symbols, or a different symbol version?  If
> > > not in the vDSO, are there any performance implications from
> > > using a clock_gettime64 syscall in place of a clock_gettime call
> > > to the vDSO? 
> >
> > This would need to be checked how severe is the performance
> > regression when one uses clock_gettime64 instead of dedicated
> > __vdso_clock_gettime (aliased to clock_gettime()).
> >
> > It is also up to the Linux kernel community to decide if it is
> > acceptable to introduce vclock_gettime64.c file, which would provide
> > vdso for clock_gettime64.  
> 
> The vdso code is undergoing a rewrite to use the same implementation
> across all architectures, once that is complete, the added
> clock_gettime64 vdso support is added trivially.
> 
> We decided not to add the calls earlier since that would clash with
> the rewrite, and not be easily testable across architectures.

Thanks for sharing the update.

> 
> > > (I think the code using the vDSO will automatically fall back to a
> > > corresponding syscall if the vDSO symbol isn't there, but answers
> > > to those questions will still be relevant for reviewing any patch
> > > for clock_gettime64 and understanding exactly what code paths it
> > > will use.)  
> >
> > Yes, correct as in [1], [2].
> >
> > However, I do believe that on the beginning glibc shall support only
> > the syscall version of clock_gettime64 and make the switch for vdso
> > only when it is available from the Kernel.  
> 
> I think you need to do that anyway, since not all 32-bit architectures
> support a vdso at all. As I understood, all vdso support is optional,
> so there is always a runtime fallback to deal with kernels lack
> the vdso for some reason.

Yes, correct - there is always a fallback when vsdo is not available.

> 
>        Arnd




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Stepan Golosunov May 16, 2019, 7:54 p.m. UTC | #28
16.05.2019 в 06:00:56 +0200 Arnd Bergmann написал:
> On Wed, May 8, 2019 at 9:48 PM Stepan Golosunov <stepan@golosunov.pp.ru> wrote:
> > 08.05.2019 в 09:51:23 +0200 Lukasz Majewski написал:
> 
> > You need to handle the following 4 cases:
> >
> > 1.  32-bit syscall should not be called (and may not even exist):
> > 1.1.  __NR_clock_settime is 64 bit.
> > 1.2.  __NR_clock_settime64 works in all supported kernels.
> >
> > 2.  __NR_clock_settime is 32-bit and __NR_clock_settime64 may be
> > unavailable.
> > 2.1.  __NR_clock_settime64 is available in kernel headers, but may
> > fail with ENOSYS when called.  32-bit __NR_clock_settime should be
> > called if __NR_clock_settime64 is failing.
> > 2.2.  __NR_clock_settime64 is not present in kernel headers.
> > 32-bit __NR_clock_settime should be called.
> 
> Is there still a chance to revisit that last case? I had assumed that
> glibc would have to require new kernel headers to support 64-bit
> time_t since some of the macro definitions (ioctl commands,
> socket options, ...) in the kernel have changed to deal with both
> cases.

This is about building glibc itself, not applications that use it.
And time_t would be 32-bit in this case anyway.

> Building an application with 64-bit time_t using older headers is
> likely to cause random problems.

Then it might be a good idea to check for recent enough kernel headers
when building programs that define _TIME_BITS to 64.  (Though if those
programs use above-mentioned ioctls and socket options they will
likely need new enough kernel at runtime too.)
  
Joseph Myers May 16, 2019, 7:59 p.m. UTC | #29
On Thu, 16 May 2019, Arnd Bergmann wrote:

> > 2.2.  __NR_clock_settime64 is not present in kernel headers.
> > 32-bit __NR_clock_settime should be called.
> 
> Is there still a chance to revisit that last case? I had assumed that
> glibc would have to require new kernel headers to support 64-bit
> time_t since some of the macro definitions (ioctl commands,
> socket options, ...) in the kernel have changed to deal with both
> cases.

In general, we're wary of being too aggressive about requirements for 
building glibc (although the minimum kernel header version for building 
glibc may still be more recent than the minimum kernel version required by 
glibc at runtime).  If anyone building glibc on a distribution from the 
past couple of years does not need to install extra build requirements 
locally, that's convenient.

In this case, there is *no* released kernel with those __NR_* in its 
headers that is good for glibc testing, because of the fds_bits / val 
namespace issues in 5.1 (these sorts of changes are an example of exactly 
the kind of change for which the conform/ tests are most important because 
of the risk of accidentally introducing namespace issues, whether 
link-time or in the headers).  And even once some suitable kernel exists, 
it would be a while before it's ubiquitous for people building glibc.

I'd expect the only effects of allowing for not having __NR_* would 
generally be that the case where the syscalls are not assumed to be 
available at runtime also has an #ifdef on the __NR_* macro, so that if 
it's not defined it falls straight through into exactly the same fallback 
code as if the syscall had failed with ENOSYS.
  
Florian Weimer May 17, 2019, 7:23 a.m. UTC | #30
* Joseph Myers:

> I'd expect the only effects of allowing for not having __NR_* would 
> generally be that the case where the syscalls are not assumed to be 
> available at runtime also has an #ifdef on the __NR_* macro, so that if 
> it's not defined it falls straight through into exactly the same fallback 
> code as if the syscall had failed with ENOSYS.

An alternative would be to ship all the system call constants in the
glibc source tree (and check that they match the installed kernel header
sources if available there).

Thanks,
Florian
  
Arnd Bergmann May 17, 2019, 8:34 a.m. UTC | #31
On Fri, May 17, 2019 at 9:23 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Joseph Myers:
>
> > I'd expect the only effects of allowing for not having __NR_* would
> > generally be that the case where the syscalls are not assumed to be
> > available at runtime also has an #ifdef on the __NR_* macro, so that if
> > it's not defined it falls straight through into exactly the same fallback
> > code as if the syscall had failed with ENOSYS.
>
> An alternative would be to ship all the system call constants in the
> glibc source tree (and check that they match the installed kernel header
> sources if available there).

To clarify: I was not asking about the __NR_* constants, as those don't
ever change, and using older headers will not cause incorrect behavior
but just invoke the fallback.

The two classes of problems I'm interested in are:

- constants that depend on 'sizeof(time_t)', e.g. SO_TIMESTAMP
- structures that used time_t instead of a fixed-length type in older
  headers, e.g. 'struct input_event'

We have a number of those, and in each case, using the old
kernel headers will lead to an ABI mismatch between the application
and the kernel.

       Arnd
  
Stepan Golosunov May 31, 2019, 11:37 a.m. UTC | #32
08.05.2019 в 12:18:40 +0200 Lukasz Majewski написал:
> Hi Joseph,
> 
> > On Mon, 6 May 2019, Lukasz Majewski wrote:
> > 
> > > > So, which is (or will be) the case in 5.1 release?  Padding
> > > > ignored or not?  
> > > 
> > > As confirmed in the other mail - the padding is ignored in Linux
> > > kernel (and the fix patch for x32 is up its way to be pulled).  
> > 
> > Did the patch to ignore padding (for compat syscalls under 64-bit
> > kernels, non-x32) make it into the final 5.1 release?
> 
> As fair as I can tell, it was not pulled to 5.1.

The patch went into 5.1.5 and 5.2-rc1.

So the question now is:

Should Linux 5.1.0–5.1.4 be considered buggy and unsupported, or
should glibc clear padding around tv_nsec on 32-bit architectures when
__LINUX_KERNEL_VERSION < 0x050105 and 64-bit kernel exists?
  
Lukasz Majewski June 5, 2019, 4:35 p.m. UTC | #33
Hi Stepan,

> 08.05.2019 в 12:18:40 +0200 Lukasz Majewski написал:
> > Hi Joseph,
> >   
> > > On Mon, 6 May 2019, Lukasz Majewski wrote:
> > >   
> > > > > So, which is (or will be) the case in 5.1 release?  Padding
> > > > > ignored or not?    
> > > > 
> > > > As confirmed in the other mail - the padding is ignored in Linux
> > > > kernel (and the fix patch for x32 is up its way to be
> > > > pulled).    
> > > 
> > > Did the patch to ignore padding (for compat syscalls under 64-bit
> > > kernels, non-x32) make it into the final 5.1 release?  
> > 
> > As fair as I can tell, it was not pulled to 5.1.  
> 
> The patch went into 5.1.5 and 5.2-rc1.
> 
> So the question now is:
> 
> Should Linux 5.1.0–5.1.4 be considered buggy and unsupported, or
> should glibc clear padding around tv_nsec on 32-bit architectures when
> __LINUX_KERNEL_VERSION < 0x050105 and 64-bit kernel exists?

I would assume that the kernel is buggy for 5.1.0–5.1.4 on x32 (I
don't know what would be the impact of such decision - to be more
specific how many x32 ABI users would be affected).

Moreover, I would add the condition __LINUX_KERNEL_VERSION >= 0x050105
when we define __ASSUME_TIME64_SYSCALLS.



Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
  
Stepan Golosunov June 6, 2019, 6:50 a.m. UTC | #34
05.06.2019 в 18:35:16 +0200 Lukasz Majewski написал:
> Hi Stepan,
> 
> > 08.05.2019 в 12:18:40 +0200 Lukasz Majewski написал:
> > > Hi Joseph,
> > >   
> > > > On Mon, 6 May 2019, Lukasz Majewski wrote:
> > > >   
> > > > > > So, which is (or will be) the case in 5.1 release?  Padding
> > > > > > ignored or not?    
> > > > > 
> > > > > As confirmed in the other mail - the padding is ignored in Linux
> > > > > kernel (and the fix patch for x32 is up its way to be
> > > > > pulled).    
> > > > 
> > > > Did the patch to ignore padding (for compat syscalls under 64-bit
> > > > kernels, non-x32) make it into the final 5.1 release?  
> > > 
> > > As fair as I can tell, it was not pulled to 5.1.  
> > 
> > The patch went into 5.1.5 and 5.2-rc1.
> > 
> > So the question now is:
> > 
> > Should Linux 5.1.0–5.1.4 be considered buggy and unsupported, or
> > should glibc clear padding around tv_nsec on 32-bit architectures when
> > __LINUX_KERNEL_VERSION < 0x050105 and 64-bit kernel exists?
> 
> I would assume that the kernel is buggy for 5.1.0–5.1.4 on x32 (I
> don't know what would be the impact of such decision - to be more
> specific how many x32 ABI users would be affected).

x32 is irrelevant for this bug (as long as it has 64-bit tv_nsec in
glibc).  All other 32-bit ABIs which can be used on 64-bit kernels are
affected.

> Moreover, I would add the condition __LINUX_KERNEL_VERSION >= 0x050105
> when we define __ASSUME_TIME64_SYSCALLS.

This won't help as _time64 syscalls should be called even when
__ASSUME_TIME64_SYSCALLS is not defined.
  

Patch

diff --git a/sysdeps/unix/sysv/linux/kernel-features.h b/sysdeps/unix/sysv/linux/kernel-features.h
index bc5c959f58..2dbe5ada4c 100644
--- a/sysdeps/unix/sysv/linux/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/kernel-features.h
@@ -143,3 +143,17 @@ 
    */
 
 #define __ASSUME_CLONE_DEFAULT 1
+
+/* Support for 64 bit version of clock_* Linux syscalls.
+
+   Support for following time related (and Y2038 safe) syscalls has been added
+   in the 5.1 Linux kernel:
+
+   clock_gettime64 (nr. 403)
+   clock_settime64 (nr. 404)
+   clock_getres_time64 (nr. 406)
+   clock_nanosleep_time64 (nr. 407)
+  */
+#if __LINUX_KERNEL_VERSION >= 0x050100
+# define __ASSUME_64BIT_TIME 1
+#endif