[1/1] Y2038: add function __difftime64

Message ID 9b02ad78-15cf-969e-b2ef-cae13d5a689e@cs.ucla.edu
State New, archived
Headers

Commit Message

Paul Eggert June 20, 2018, 7:29 p.m. UTC
  On 06/20/2018 05:14 AM, Albert ARIBAUD (3ADEV) wrote:
> This change batch creates a __difftime64 compatible with 64-bit time,
> and makes difftime either an alias of it or a 32-bit wrapper around it.

Since no glibc code calls difftime, can we assume that a later patch 
will change will make __difftime64 and difftime both available to user 
code? I'm not getting the big picture here.

Which public development repository and branch are you using? I see lots 
of glibc branches at sourceware.org named origin/aaribaud/y2038* but 
none of them seem to correspond that the patches you're sending. If I 
could see the current state of the entire set of patches you've 
developed, that would save time reviewing.

>     We could make it return a long double

It's not just that existing programs expect difftime to return 'double'; 
it's also that C11 and POSIX both require it to return 'double'.

> 2. The 64-bit time implementation was obtained by duplicating the
>     original 32-bit code then simplifying the source code based on
>     the knowledge that __time64_t is a 64-bit signed integer

This sort of simplification won't be possible in Gnulib, where 
__time64_t will be an alias of time_t and therefore could be an unsigned 
type. So let's not do that simplification. (Gnulib-using programs will 
always ask for 64-bit time_t if that is an option and 32-bit is the 
default, as the 32-bit default is silly if you have source code.)

It is OK to simplify difftime.c based on the assumption that time_t is 
an integer type. Glibc difftime.c was written back when POSIX allowed 
time_t to be floating-point, and some ancient implementations did that. 
This was widely regarded to be a mistake, POSIX no longer allows 
floating-point time_t and we don't need to worry about those old 
implementations. However, this simplification should be done as a 
separate patch (see first attachment).

>     - in the difftime64 function, removal of code which handled
>       time bitsize smaller than or equal to that of a double matissa
>       (as __time64_t has more bits than a double mantissa can hold)

This simplification is not possible in Gnulib either, as it's not 
portable to assume that time_t has more bits than a double fraction can 
hold. (They're fractions, not mantissas, by the way; mantissas are for 
logarithms not for floating-point.)

> -static double
> -subtract (time_t time1, time_t time0)
> +static double subtract (__time64_t time1, __time64_t time0)

Don't change indentation in cases like this; just stick to the glibc style.

>   {
> -  if (! TYPE_SIGNED (time_t))
> -    return time1 - time0;

Let's leave that TYPE_SIGNED test in, for the benefit of non-glibc uses 
where time_t is unsigned.

> -  if (TYPE_BITS (time_t) <= DBL_MANT_DIG
> -      || (TYPE_FLOATING (time_t) && sizeof (time_t) < sizeof (long double)))
> -    return (double) time1 - (double) time0;

Likewise, this needs to stay in, for portability.

The patch needs a better commit message, as the commit message doesn't 
say what exactly changed and has too much unnecessary talk about things 
we aren't doing. Commit messages should focus on what actually changed; 
the meat of any comments explaining the code should be in the code.

Attached is a proposed pair of (untested) patches that should reflect 
the above comments. What do you think?

PS. The mktime patches you sent have more problems like this. But that's 
a bigger nut to crack, and let's get difftime done first.
  

Comments

Albert ARIBAUD June 20, 2018, 8:55 p.m. UTC | #1
Hi Paul,

On Wed, 20 Jun 2018 12:29:06 -0700, Paul Eggert <eggert@cs.ucla.edu>
wrote :

> On 06/20/2018 05:14 AM, Albert ARIBAUD (3ADEV) wrote:
> > This change batch creates a __difftime64 compatible with 64-bit time,
> > and makes difftime either an alias of it or a 32-bit wrapper around it.  
> 
> Since no glibc code calls difftime, can we assume that a later patch 
> will change will make __difftime64 and difftime both available to user 
> code? I'm not getting the big picture here.

Actually, this is not an assumption; this is explicitly stated in the
whole patch series posted as an RFC previously, twice (original and v2),
with a detailed cover letter explaining how the series was structured.
See e.g.

https://patchwork.ozlabs.org/cover/811194/ (original RFC series)
https://patchwork.ozlabs.org/patch/900333/ (second iteration)

These series were cut into many individual patches introducing new
internal (or at least private) type and functions plus a final patch to
update the public API.

> Which public development repository and branch are you using? I see lots 
> of glibc branches at sourceware.org named origin/aaribaud/y2038* but 
> none of them seem to correspond that the patches you're sending. If I 
> could see the current state of the entire set of patches you've 
> developed, that would save time reviewing.

The entire patch sets were available at RFC time as branches on the
sourceware repository.

Based on the relatively low amount of comments received on the second
RFC series, I thought I could post the series for good, with a fair
chance that any additional change request would be cosmetic, so I
packed related changes together to reduce the amount of patch sets and
went ahead.

What actually happened is, I am getting quite more feedback now than I
got with the RFCs.

This implies I am rewriting the series in near real-time, which means
the branches are ever-evolving now.

I'll provide two branches for reviewers as follows:

- the currently posted patches will be under aaribaud/y2038-posted.
  I will rebase this branch whenever I post a new series or a new
  version of an already posted series, or rebase above a more recent
  origin/master. This branch will undergo frequent tests and runs of
  build-many-glibcs.py

- the complete patch set, including the public API patch, will be
  under aaribaud/y2038-complete. I will keep int rebased above the
  aaribaud/y2038-posted branch, but it will not be tested as often, and
  may quite possibly not build at all at times.

Other branches under aaribaud/ may change at any time and be in any
state (for instance I use the -wip one to pass changes between my dev
machine and the one which runs the build-many-glibcs.py script).

> > 2. The 64-bit time implementation was obtained by duplicating the
> >     original 32-bit code then simplifying the source code based on
> >     the knowledge that __time64_t is a 64-bit signed integer  
> 
> This sort of simplification won't be possible in Gnulib, where 
> __time64_t will be an alias of time_t and therefore could be an unsigned 
> type. So let's not do that simplification. (Gnulib-using programs will 
> always ask for 64-bit time_t if that is an option and 32-bit is the 
> default, as the 32-bit default is silly if you have source code.)

We do currently have unsigned 32-bit time_t implementations, but do we
have unsigned 64-bit time_t? A signed 64-bit integer __time64_t was
a core assumption as early as the first of the seven revisions
of https://sourceware.org/glibc/wiki/Y2038ProofnessDesign; it's a pity
the existence of unsigned 64-bit time_t was never raised.

> It is OK to simplify difftime.c based on the assumption that time_t is 
> an integer type. Glibc difftime.c was written back when POSIX allowed 
> time_t to be floating-point, and some ancient implementations did that. 
> This was widely regarded to be a mistake, POSIX no longer allows 
> floating-point time_t and we don't need to worry about those old 
> implementations. However, this simplification should be done as a 
> separate patch (see first attachment).
>
> >     - in the difftime64 function, removal of code which handled
> >       time bitsize smaller than or equal to that of a double matissa
> >       (as __time64_t has more bits than a double mantissa can hold)  
> 
> This simplification is not possible in Gnulib either, as it's not 
> portable to assume that time_t has more bits than a double fraction can 
> hold. (They're fractions, not mantissas, by the way; mantissas are for 
> logarithms not for floating-point.)
>
> The patch needs a better commit message, as the commit message doesn't 
> say what exactly changed and has too much unnecessary talk about things 
> we aren't doing. Commit messages should focus on what actually changed; 
> the meat of any comments explaining the code should be in the code.
> 
> Attached is a proposed pair of (untested) patches that should reflect 
> the above comments. What do you think?
>
> PS. The mktime patches you sent have more problems like this. But that's 
> a bigger nut to crack, and let's get difftime done first.

If the proposed patches are OK, then the best is to apply them to
master and I will remove the corresponding ones from the Y2038 series
(or I can put them in the series with you as the commit author, although
I am worried that if I have to edit them I might accidentally reset the
authorship to me.

Cordialement,
Albert ARIBAUD
3ADEV
  
Paul Eggert June 21, 2018, 9:17 p.m. UTC | #2
On 06/20/2018 01:55 PM, Albert ARIBAUD wrote:
>
>>> This change batch creates a __difftime64 compatible with 64-bit time,
>>> and makes difftime either an alias of it or a 32-bit wrapper around it.
>> Since no glibc code calls difftime, can we assume that a later patch
>> will change will make __difftime64 and difftime both available to user
>> code? I'm not getting the big picture here.
> Actually, this is not an assumption

In that case, I'm still puzzled about the specific example of difftime. 
Are you assuming that glibc can export just one difftime function to 
32-bit user code, and that because every 32-bit time_t value fits into 
__time64_t this single function will work with both time_t and 
__time64_t? That assumption fails in many platforms (as they use 
different calling conventions for 32-bit versus 64-bit integers), so in 
general a glibc implementation on a platform that currently has 32-bit 
time_t will need to export two difftime entry points.

> What actually happened is, I am getting quite more feedback now than I
> got with the RFCs.

That's a good sign. You're getting feedback now. Many patches languish 
nearly forever because nobody has the time to review them, unfortunately.

> I'll provide two branches for reviewers

Thanks, please let us know when they're ready. That will help me answer 
questions like the difftime question above.

> We do currently have unsigned 32-bit time_t implementations, but do we
> have unsigned 64-bit time_t? A signed 64-bit integer __time64_t was
> a core assumption

That can be a core assumption of glibc, but for code shared with Gnulib 
we'd rather not make assumptions like that. It's little trouble to 
support unsigned 64-bit (or even wider) time_t in Gnulib code, and we 
should continue to support this. All we should need to assume is that 
every time_t value fits into __time64_t. Come to think of it, __time64_t 
will be a misnomer in Gnulib since __time64_t is not necessarily a 
64-bit type, so perhaps code shared with Gnulib should continue to use 
the name "internal_time_t" (for internal use only, of course) to 
underscore the fact that it might not be 64 bits.

> If the proposed patches are OK, then the best is to apply them to
> master and I will remove the corresponding ones from the Y2038 series
>

I don't think we're ready to start installing these into master, since I 
still don't fully understand how difftime (much less mktime) will work. 
Also, I want to make sure that the patches will be compatible with 
Gnulib. It may make more sense to change Gnulib first, as it is less 
formal about this sort of update.
  
Albert ARIBAUD June 25, 2018, 10:32 p.m. UTC | #3
Hi Paul,

On Thu, 21 Jun 2018 14:17:24 -0700, Paul Eggert <eggert@cs.ucla.edu>
wrote :

> On 06/20/2018 01:55 PM, Albert ARIBAUD wrote:
> >  
>  [...]  
> >> Since no glibc code calls difftime, can we assume that a later patch
> >> will change will make __difftime64 and difftime both available to user
> >> code? I'm not getting the big picture here.  
> > Actually, this is not an assumption  
> 
> In that case, I'm still puzzled about the specific example of difftime. 
> Are you assuming that glibc can export just one difftime function to 
> 32-bit user code, and that because every 32-bit time_t value fits into 
> __time64_t this single function will work with both time_t and 
> __time64_t? That assumption fails in many platforms (as they use 
> different calling conventions for 32-bit versus 64-bit integers), so in 
> general a glibc implementation on a platform that currently has 32-bit 
> time_t will need to export two difftime entry points.

I am indeed assuming that every 32-bit time_t value will fit into a
64-bit __time64_t, but I don't infer from this that 64-bit as well as
32-bit time user code could use a single function as their difftime.
Actually, if you'll remember my first rounds of RFC patches (which you
did comment), I had initially elected to provide a 64-bit difftime
alongside the original difftime; and later, since wrappers seemed to be
the preference, I replaced the original 32-bit difftime with a wrapper
around the 64-bit time; and in both cases, there were two functions, on
for each time size.

> > What actually happened is, I am getting quite more feedback now than I
> > got with the RFCs.  
> 
> That's a good sign. You're getting feedback now. Many patches languish 
> nearly forever because nobody has the time to review them, unfortunately.
>
> > I'll provide two branches for reviewers  
> 
> Thanks, please let us know when they're ready. That will help me answer 
> questions like the difftime question above.
>
> > We do currently have unsigned 32-bit time_t implementations, but do we
> > have unsigned 64-bit time_t? A signed 64-bit integer __time64_t was
> > a core assumption  
> 
> That can be a core assumption of glibc, but for code shared with Gnulib 
> we'd rather not make assumptions like that. It's little trouble to 
> support unsigned 64-bit (or even wider) time_t in Gnulib code, and we 
> should continue to support this. All we should need to assume is that 
> every time_t value fits into __time64_t. Come to think of it, __time64_t 
> will be a misnomer in Gnulib since __time64_t is not necessarily a 
> 64-bit type, so perhaps code shared with Gnulib should continue to use 
> the name "internal_time_t" (for internal use only, of course) to 
> underscore the fact that it might not be 64 bits.
> 
> > If the proposed patches are OK, then the best is to apply them to
> > master and I will remove the corresponding ones from the Y2038 series
> >  
> 
> I don't think we're ready to start installing these into master, since I 
> still don't fully understand how difftime (much less mktime) will work. 
> Also, I want to make sure that the patches will be compatible with 
> Gnulib. It may make more sense to change Gnulib first, as it is less 
> formal about this sort of update.

Since you want the changes in gnulib first, then I suspect I should
provide branches above gnulib as well as above glibc? If so, what
would you recommend as a good source on setting up a build and test
setup for gnulib, similar to build-many-glibcs.py is for glibc?

Cordialement,
Albert ARIBAUD
3ADEV
  
Paul Eggert June 25, 2018, 11:56 p.m. UTC | #4
On 06/25/2018 03:32 PM, Albert ARIBAUD wrote:
> I replaced the original 32-bit difftime with a wrapper
> around the 64-bit time; and in both cases, there were two functions, on
> for each time size.

Yes, and all this is in the patch you posted in 
<https://www.sourceware.org/ml/libc-alpha/2018-06/msg00605.html>. What 
I'm not understanding is how the functions it defines (__difftime64 and 
__difftime on 32-bit hosts, and just __difftime on 64-bit hosts) 
interoperate the with user-defined macro (is it still _TIME_BITS? I 
can't recall) that specifies whether the user wants time_t is 64 or 32 
bits, and how they interoperate with the macro that Joseph Myers 
suggested was needed by the system to say whether a 32-bit time_t 
variant is supported.

In short, I'm still missing the bigger picture here.

> Since you want the changes in gnulib first, then I suspect I should
> provide branches above gnulib as well as above glibc? If so, what
> would you recommend as a good source on setting up a build and test
> setup for gnulib, similar to build-many-glibcs.py is for glibc?
>

Yes, it would be helpful to have this in Gnulib too. The basic idea is 
that Gnulib and glibc sources should be as close to each other as 
possible; preferably identical. You can build and test a Gnulib module 
by running './gnulib-tool --test modulename'; run './gnulib-tool --help' 
for more.
  
Albert ARIBAUD June 27, 2018, 11:03 a.m. UTC | #5
Hi Paul,

On Mon, 25 Jun 2018 16:56:14 -0700, Paul Eggert <eggert@cs.ucla.edu>
wrote :

> On 06/25/2018 03:32 PM, Albert ARIBAUD wrote:
> > I replaced the original 32-bit difftime with a wrapper
> > around the 64-bit time; and in both cases, there were two functions, on
> > for each time size.  
> 
> Yes, and all this is in the patch you posted in 
> <https://www.sourceware.org/ml/libc-alpha/2018-06/msg00605.html>. What 
> I'm not understanding is how the functions it defines (__difftime64 and 
> __difftime on 32-bit hosts, and just __difftime on 64-bit hosts) 
> interoperate the with user-defined macro (is it still _TIME_BITS? I 
> can't recall) that specifies whether the user wants time_t is 64 or 32 
> bits, and how they interoperate with the macro that Joseph Myers 
> suggested was needed by the system to say whether a 32-bit time_t 
> variant is supported.
> 
> In short, I'm still missing the bigger picture here.

OK, so let me try to sum things up (I'll update the design document
accordingly).

Right now, on 32-bit architectures, application source code which uses
glibc and manipulates time can only go as far as Y2038, because the
public API declares only 32-bit types (time_t and related structures)
and functions (difftime etc). Any time beyond Y2038 is undefined
behavior  -- I've documented behaviors ranging from just time rollover
(back to Dec. 1901 IIRC) to full system freeze.

My goal is:

- to make glibc provide, when _TIME_BITS is is defined equal to 64, the
  same set of type and function names but with 64-bit time support (and
  with _TIME_BITS undefined or defined equal to 32, glibc would stick
  with the old types and functions);

- to ensure that, at least for now, the existing ABI remain unchanged,
  i.e. application modules resulting from compiling against current
  glibc should link against a 64-bit-time-capable glibc and run
  without errors.

In other words, source code referring to difftime would result in
object code referring either to the existing __difftime symbol if
_TIME_BITS is not equal to 64, or to some new __difftime64 symbol if
_TIME_BITS is equal to 64; and the symbol __difftime in glibc should
always represent the existing 32-bit implementation so that existing,
already compiled, client code would execute the difftime they expect
to.

How to achieve this:

- since a single glibc header install must support compiling both
  32-bit-time and new 64-bit-time object code, glibc must provide two
  sets of public API symbol definitions, one for 32-bit-time, exposed
  if _TIME_BITS is defined equal to 32 or undefined, and one for 64-bit
  time, exposed if _TIME_BITS is defined equal to 64.

- since a single glibc library must support both existing 32-bit-time
  and new 64-bit-time object code, glibc must therefore provide two
  sets of implementation symbols at the same time: the current set for
  object client code which was compiled for 32-bit time, and a new set
  for object client code which was compiled for64-bit time.

Refining the how:

- we cannot introduce the public API and _TIME_BITS support in small
  batches, because if you switch time_t to 64-bit, that switches all
  functions using time_t to 64-bit too, and all types using time_t as
  well. Basically, _TIME_BITS and the switch of the API to 64-bit time
  has to be a single, atomic, patch.

- but we can introduce the new 64-bit symbols, including the
  publically visible ones, even though we won't expose them as they
  will be in the end. These new symbols do not need to be all introduced
  in a single patch; this can be done progressively.

For instance, in the end, we will have to map the public API's difftime
symbol either to the existing 32-bit __difftime or to a new 64-bit
__difftime64, depending on _TIME_BITS. So we will /have/ to provide a
public symbol __difftime64. Only in the final _TIME_BITS patch will
this symbol be aliased with 'difftime' -- for client code which compiles
with _TIME_BITS=64.

Further refining the how:

- We can create 64-bit implementations from existing 32-bit ones, with
  some hand-optimizing based on time_t assumptions which we know
  always / never apply to __time64_t (such as possibly being integer or
  being floating point) or size considerations (such as some precision
  loss risks which might not apply to __time64_t).

- We can/must rewrite some internals of glibc to handle __time64_t
  rather than time_t.

- Once there is a 64-bit implementation for a given functionality in
  32-bit architectures, then the 32-bit implementation can be turned
  into a wrapper around the 64-bit one (if the 32-bit implementation
  is even needed; for 64-bit architectures and x32, it does not need
  to be defined at all).

  This is where we need another macro, __TIME_SIZE. While _TIME_BITS
  tells us, at client code compile time, whether client code wants a
  64-bit time API or not, __TIMESIZE tells us, at glibc compile time,
  whether the time_t for the considered architecture is 64-bit or not.
  Architectures with 64-bit time_t do not need 32-bit wrappers, and can
  use time_t for their __time64_t; others need the 32-bit wrappers, and
  need separate time_t and __time64_t.

Final picture:

- the set of patches to support Y2038 will consist in new types and new
  functions (either visible from the public or purely internal to
  glibc) or internal glibc function conversions to 64 bit; these
  patches will have some of the implementation dependent on __TIMESIZE.
  Then in the end a single patch will expose the 'old' or 'new' set
  depending on _TIME_BITS. 

Is this clear enough? Don't hesitate to ask for further clarifications.

> > Since you want the changes in gnulib first, then I suspect I should
> > provide branches above gnulib as well as above glibc? If so, what
> > would you recommend as a good source on setting up a build and test
> > setup for gnulib, similar to build-many-glibcs.py is for glibc?
> 
> Yes, it would be helpful to have this in Gnulib too. The basic idea is 
> that Gnulib and glibc sources should be as close to each other as 
> possible; preferably identical. You can build and test a Gnulib module 
> by running './gnulib-tool --test modulename'; run './gnulib-tool --help' 
> for more.

Since gnulib is on Savannah, not Sourceware, I assume I will need to be
given some level of write access to the Savannah gnulib repository in
order to provide branches there too, similar to what is done in glibc.
Who should I ask this?

Cordialement,
Albert ARIBAUD
3ADEV
  
Albert ARIBAUD July 5, 2018, 6:36 p.m. UTC | #6
Hello all,

On Wed, 27 Jun 2018 13:03:42 +0200, Albert ARIBAUD
<albert.aribaud@3adev.fr> wrote :

> Hi Paul,
> 
> On Mon, 25 Jun 2018 16:56:14 -0700, Paul Eggert <eggert@cs.ucla.edu>
> wrote :
> 
> > On 06/25/2018 03:32 PM, Albert ARIBAUD wrote:  
> > > I replaced the original 32-bit difftime with a wrapper
> > > around the 64-bit time; and in both cases, there were two functions, on
> > > for each time size.    
> > 
> > Yes, and all this is in the patch you posted in 
> > <https://www.sourceware.org/ml/libc-alpha/2018-06/msg00605.html>. What 
> > I'm not understanding is how the functions it defines (__difftime64 and 
> > __difftime on 32-bit hosts, and just __difftime on 64-bit hosts) 
> > interoperate the with user-defined macro (is it still _TIME_BITS? I 
> > can't recall) that specifies whether the user wants time_t is 64 or 32 
> > bits, and how they interoperate with the macro that Joseph Myers 
> > suggested was needed by the system to say whether a 32-bit time_t 
> > variant is supported.
> > 
> > In short, I'm still missing the bigger picture here.  
> 
> OK, so let me try to sum things up (I'll update the design document
> accordingly).
> 
> Right now, on 32-bit architectures, application source code which uses
> glibc and manipulates time can only go as far as Y2038, because the
> public API declares only 32-bit types (time_t and related structures)
> and functions (difftime etc). Any time beyond Y2038 is undefined
> behavior  -- I've documented behaviors ranging from just time rollover
> (back to Dec. 1901 IIRC) to full system freeze.
> 
> My goal is:
> 
> - to make glibc provide, when _TIME_BITS is is defined equal to 64, the
>   same set of type and function names but with 64-bit time support (and
>   with _TIME_BITS undefined or defined equal to 32, glibc would stick
>   with the old types and functions);
> 
> - to ensure that, at least for now, the existing ABI remain unchanged,
>   i.e. application modules resulting from compiling against current
>   glibc should link against a 64-bit-time-capable glibc and run
>   without errors.
> 
> In other words, source code referring to difftime would result in
> object code referring either to the existing __difftime symbol if
> _TIME_BITS is not equal to 64, or to some new __difftime64 symbol if
> _TIME_BITS is equal to 64; and the symbol __difftime in glibc should
> always represent the existing 32-bit implementation so that existing,
> already compiled, client code would execute the difftime they expect
> to.
> 
> How to achieve this:
> 
> - since a single glibc header install must support compiling both
>   32-bit-time and new 64-bit-time object code, glibc must provide two
>   sets of public API symbol definitions, one for 32-bit-time, exposed
>   if _TIME_BITS is defined equal to 32 or undefined, and one for 64-bit
>   time, exposed if _TIME_BITS is defined equal to 64.
> 
> - since a single glibc library must support both existing 32-bit-time
>   and new 64-bit-time object code, glibc must therefore provide two
>   sets of implementation symbols at the same time: the current set for
>   object client code which was compiled for 32-bit time, and a new set
>   for object client code which was compiled for64-bit time.
> 
> Refining the how:
> 
> - we cannot introduce the public API and _TIME_BITS support in small
>   batches, because if you switch time_t to 64-bit, that switches all
>   functions using time_t to 64-bit too, and all types using time_t as
>   well. Basically, _TIME_BITS and the switch of the API to 64-bit time
>   has to be a single, atomic, patch.
> 
> - but we can introduce the new 64-bit symbols, including the
>   publically visible ones, even though we won't expose them as they
>   will be in the end. These new symbols do not need to be all introduced
>   in a single patch; this can be done progressively.
> 
> For instance, in the end, we will have to map the public API's difftime
> symbol either to the existing 32-bit __difftime or to a new 64-bit
> __difftime64, depending on _TIME_BITS. So we will /have/ to provide a
> public symbol __difftime64. Only in the final _TIME_BITS patch will
> this symbol be aliased with 'difftime' -- for client code which compiles
> with _TIME_BITS=64.
> 
> Further refining the how:
> 
> - We can create 64-bit implementations from existing 32-bit ones, with
>   some hand-optimizing based on time_t assumptions which we know
>   always / never apply to __time64_t (such as possibly being integer or
>   being floating point) or size considerations (such as some precision
>   loss risks which might not apply to __time64_t).
> 
> - We can/must rewrite some internals of glibc to handle __time64_t
>   rather than time_t.
> 
> - Once there is a 64-bit implementation for a given functionality in
>   32-bit architectures, then the 32-bit implementation can be turned
>   into a wrapper around the 64-bit one (if the 32-bit implementation
>   is even needed; for 64-bit architectures and x32, it does not need
>   to be defined at all).
> 
>   This is where we need another macro, __TIME_SIZE. While _TIME_BITS
>   tells us, at client code compile time, whether client code wants a
>   64-bit time API or not, __TIMESIZE tells us, at glibc compile time,
>   whether the time_t for the considered architecture is 64-bit or not.
>   Architectures with 64-bit time_t do not need 32-bit wrappers, and can
>   use time_t for their __time64_t; others need the 32-bit wrappers, and
>   need separate time_t and __time64_t.
> 
> Final picture:
> 
> - the set of patches to support Y2038 will consist in new types and new
>   functions (either visible from the public or purely internal to
>   glibc) or internal glibc function conversions to 64 bit; these
>   patches will have some of the implementation dependent on __TIMESIZE.
>   Then in the end a single patch will expose the 'old' or 'new' set
>   depending on _TIME_BITS. 
> 
> Is this clear enough? Don't hesitate to ask for further clarifications.
> 
> > > Since you want the changes in gnulib first, then I suspect I should
> > > provide branches above gnulib as well as above glibc? If so, what
> > > would you recommend as a good source on setting up a build and test
> > > setup for gnulib, similar to build-many-glibcs.py is for glibc?  
> > 
> > Yes, it would be helpful to have this in Gnulib too. The basic idea is 
> > that Gnulib and glibc sources should be as close to each other as 
> > possible; preferably identical. You can build and test a Gnulib module 
> > by running './gnulib-tool --test modulename'; run './gnulib-tool --help' 
> > for more.  
> 
> Since gnulib is on Savannah, not Sourceware, I assume I will need to be
> given some level of write access to the Savannah gnulib repository in
> order to provide branches there too, similar to what is done in glibc.
> Who should I ask this?

Cross-posting this to both the glibc and gnulib lists for now; if anyone
thinks this thread should stay on only gnulib, feel free.

I have had a look at gnulib in the meantime, and I would like to know
if the following assumptions are correct:

- gnulib does not contain any module which provides the time_t type, but
  some of gnulib's modules assume there is such a type, and it might be
  wider than 32 bits.

- gnulib contains a year2038 module which is only intended to check
  whether time_t is limited to Y2038 or not.

- gnulib does not provide difftime either, so ATM a difftime patch
  would only make sense in glibc, not gnulib.

- gnulib makes no assumption on how it is used in object form; notably,
  it makes no assumption that it will be compiled into a shared object
  form which will provide the same functionalities for both 64-bit and
  32-bit callers.

Are all these correct, and if not, could someone develop in what
respect they aren't?

Cordialement,
Albert ARIBAUD
3ADEV
  
Zack Weinberg July 5, 2018, 7:13 p.m. UTC | #7
On Thu, Jul 5, 2018 at 2:36 PM, Albert ARIBAUD <albert.aribaud@3adev.fr> wrote:
> I have had a look at gnulib in the meantime, and I would like to know
> if the following assumptions are correct:

I can't comment on anything else at all, but:

> - gnulib does not contain any module which provides the time_t type, but
>   some of gnulib's modules assume there is such a type, and it might be
>   wider than 32 bits.
>
> - gnulib does not provide difftime either, so ATM a difftime patch
>   would only make sense in glibc, not gnulib.

<time.h> has always been required to declare a type named time_t and a
function named difftime, ever since the original C standard in 1989.
So it makes sense for gnulib not to provide either, and to assume they
exist.

In the original C standard time_t is only required to be "an
arithmetic type capable of representing times".  That means it could
be any size of integer or floating-point.   This has not changed as of
C11 -- the term "arithmetic type" changed to "real type" because of
the addition of complex numbers, but that's all.  POSIX adds the
additional guarantee that time_t is an integer type, but still says
nothing about how wide it is.

zw
  
Paul Eggert July 5, 2018, 7:40 p.m. UTC | #8
Albert ARIBAUD wrote:
> I would like to know
> if the following assumptions are correct:
> 
> - gnulib contains a year2038 module which is only intended to check
>    whether time_t is limited to Y2038 or not.

Although true for now, in the long run year2038 could be changed to enable 
macros that will cause the implementation to use 64- instead of 32-bit time_t. 
This is a plausible thing to do once glibc has such a macro.

> - gnulib does not provide difftime either, so ATM a difftime patch
>    would only make sense in glibc, not gnulib.

Although Gnulib hasn't needed a difftime module yet, it might need one once this 
32- vs 64-bit time_t stuff lands into glibc, so let's keep difftime.c usable for 
Gnulib.

> - gnulib ... makes no assumption that it will be compiled into a shared object
>    form which will provide the same functionalities for both 64-bit and
>    32-bit callers.

Although that's generally true, Gnulib can be used in such shared objects by 
compiling it twice (once for each model), using different names for each entry 
point. I vaguely recall some people doing this sort of thing for 32- vs 64-bit 
file offsets, though I don't recommend the practice myself.
  
Albert ARIBAUD July 5, 2018, 8:38 p.m. UTC | #9
Hi Paul,

On Thu, 5 Jul 2018 12:40:07 -0700, Paul Eggert <eggert@cs.ucla.edu>
wrote :

> Albert ARIBAUD wrote:
> > I would like to know
> > if the following assumptions are correct:
> > 
> > - gnulib contains a year2038 module which is only intended to check
> >    whether time_t is limited to Y2038 or not.  
> 
> Although true for now, in the long run year2038 could be changed to enable 
> macros that will cause the implementation to use 64- instead of 32-bit time_t. 
> This is a plausible thing to do once glibc has such a macro.

Now I'm confused. I was under the impression that you wanted the
64-bit-time stuff to go in gnulib before it went in glibc, so I don't
get what the "once glibc has such a macro" means. Can you elaborate on
what you had in mind?

> > - gnulib does not provide difftime either, so ATM a difftime patch
> >    would only make sense in glibc, not gnulib.  
> 
> Although Gnulib hasn't needed a difftime module yet, it might need one once this 
> 32- vs 64-bit time_t stuff lands into glibc, so let's keep difftime.c usable for 
> Gnulib.

Ditto.

> > - gnulib ... makes no assumption that it will be compiled into a shared object
> >    form which will provide the same functionalities for both 64-bit and
> >    32-bit callers.  
> 
> Although that's generally true, Gnulib can be used in such shared objects by 
> compiling it twice (once for each model), using different names for each entry 
> point. I vaguely recall some people doing this sort of thing for 32- vs 64-bit 
> file offsets, though I don't recommend the practice myself.

OK, so taking mktime as an example since gnulib provides it, and
assuming a platform where time_t is not currently Y2038-proof, if I
wanted to have a Y2038-proof mktime implementation in addition to
the existing non-Y2038-proof one, I could compile mktime.c twice, once
with the native non-Y2038-proof time_t and one with a redefined
Y2038-proof time_t and with mktime #defined to another name to avoid
having the mktime symbol defined twice.

(but I don't see yet how this is going to help making glibc Y2038-proof.
As far as I understand, glibc does not use gnulib and I don't think its
build system ever compiles a single C source file into two object files
for two different word sizes anyway).

Cordialement,
Albert ARIBAUD
3ADEV
  
Bruno Haible July 5, 2018, 8:50 p.m. UTC | #10
Hi,

> I have had a look at gnulib in the meantime, and I would like to know
> if the following assumptions are correct:
> 
> - gnulib contains a year2038 module which is only intended to check
>   whether time_t is limited to Y2038 or not.

Yes and no.

Yes, gnulib contains a year2038 module [1][2].
No, it does not "check" something: It attempts to ensure that time_t is a
64-bit type, by defining certain preprocessor symbols.
No, it does not guarantee a configuration error or compilation error if
time_t is only 32-bit. It produces a configuration error ONLY if time_t
is only 32-bit AND this can be avoided by choosing a different ABI.

> - gnulib does not contain any module which provides the time_t type, but
>   some of gnulib's modules assume there is such a type, and it might be
>   wider than 32 bits.

Correct. All portability targets of gnulib do have a time_t type. Since
it is not possible to support a wider time_t type without kernel support,
gnulib does not attempt to redefine time_t type via typedef or #define.

> - gnulib does not provide difftime either, so ATM a difftime patch
>   would only make sense in glibc, not gnulib.

difftime is only the least problem. The bigger ones are gettimeofday,
stat, fstat. But if on some platform, difftime is the only missing piece
for supporting 64-bit time_t, gnulib could easily provide a replacement.

> - gnulib makes no assumption on how it is used in object form; notably,
>   it makes no assumption that it will be compiled into a shared object
>   form which will provide the same functionalities for both 64-bit and
>   32-bit callers.

Correct. Gnulib provides an "application source interface" only, not an
"application binary interface".

Bruno

[1] https://www.gnu.org/software/gnulib/manual/html_node/Avoiding-the-year-2038-problem.html
[2] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob_plain;f=m4/year2038.m4
  
Bruno Haible July 5, 2018, 9:17 p.m. UTC | #11
Albert ARIBAUD wrote:
> I was under the impression that you wanted the
> 64-bit-time stuff to go in gnulib before it went in glibc, so I don't
> get what the "once glibc has such a macro" means. Can you elaborate on
> what you had in mind?

I can't speak for Paul, but for me the sequence of steps that produces the
desired result with the least effort would be:

  1) glibc implements the 'time_t' type that depends on the value _TIME_BITS
     defined at preprocessor level, like you described. Including support for
     'gettimeofday', 'stat', 'fstat' and the like.
     While doing so, pay attention that the implementation of mktime, strftime,
     strptime, etc. can be compiled with a 32-bit time_t or a 64-bit time_t.

  2) gnulib modifies its year2038 module to define _TIME_BITS to 64 at configure
     time, on platforms where glibc supports it.

  3) During the next source-code sync from glibc to gnulib, involving mktime.c
     etc., the gnulib people (likely Paul) make sure that this source code can
     still be used on non-glibc platforms with either 32-bit time_t or 64-bit
     time_t. Usually this involves a couple of #ifs and conditional #includes.
     These changes can then flow back into glibc. They won't have a functional
     effect in glibc, therefore won't break the atomicity of step 1.

AFAICS, steps 3 could also be executed before step 2.

Bruno
  
Albert ARIBAUD July 6, 2018, 5:05 a.m. UTC | #12
Hi Bruno,

Le Thu, 05 Jul 2018 23:17:26 +0200, Bruno Haible <bruno@clisp.org> a
écrit :

> Albert ARIBAUD wrote:
> > I was under the impression that you wanted the
> > 64-bit-time stuff to go in gnulib before it went in glibc, so I don't
> > get what the "once glibc has such a macro" means. Can you elaborate on
> > what you had in mind?  
> 
> I can't speak for Paul, but for me the sequence of steps that produces the
> desired result with the least effort would be:
> 
>   1) glibc implements the 'time_t' type that depends on the value _TIME_BITS
>      defined at preprocessor level, like you described. Including support for
>      'gettimeofday', 'stat', 'fstat' and the like.
>      While doing so, pay attention that the implementation of mktime, strftime,
>      strptime, etc. can be compiled with a 32-bit time_t or a 64-bit time_t.
> 
>   2) gnulib modifies its year2038 module to define _TIME_BITS to 64 at configure
>      time, on platforms where glibc supports it.
> 
>   3) During the next source-code sync from glibc to gnulib, involving mktime.c
>      etc., the gnulib people (likely Paul) make sure that this source code can
>      still be used on non-glibc platforms with either 32-bit time_t or 64-bit
>      time_t. Usually this involves a couple of #ifs and conditional #includes.
>      These changes can then flow back into glibc. They won't have a functional
>      effect in glibc, therefore won't break the atomicity of step 1.
> 
> AFAICS, steps 3 could also be executed before step 2.
> 
> Bruno

Thanks for the detailed answer(s, including your previous one).

Do code syncs from glibc to gnulib happen on new glibc releases? If
they do, then my chick-and-egg problem remains (unless glibc accepts
pulling in only my glibc patch which add __time_64_t without changing
the public API, but that would amount to dead code.

Cordialement,
Albert ARIBAUD
3ADEV
  
Paul Eggert July 6, 2018, 10:37 p.m. UTC | #13
Albert ARIBAUD wrote:

>> Although true for now, in the long run year2038 could be changed to enable
>> macros that will cause the implementation to use 64- instead of 32-bit time_t.
>> This is a plausible thing to do once glibc has such a macro.
> 
> Now I'm confused. I was under the impression that you wanted the
> 64-bit-time stuff to go in gnulib before it went in glibc, so I don't
> get what the "once glibc has such a macro" means. Can you elaborate on
> what you had in mind?

Gnulib is easier to change than glibc, so it makes sense to update it first.

Currently Gnulib's year2038 module doesn't enable any Glibc macros to prefer 64- 
to 32-bit time_t when both are available - since there aren't any - but when 
Glibc has one, we will change Gnulib to define the macro. We can make this 
change to Gnulib now, if the Glibc macro's name is stable (is it?). However, I 
was thinking of something more ambitious: having Gnulib support 64-bit time_t on 
32-bit glibc now, even before the Glibc support is in. In doing this, it would 
emulate the planned Glibc behavior by defining the Glibc macro's name (even 
though the underlying Glibc doesn't use the macro yet). As this is still a 
possibility, I would rather not make the assumption that Gnulib will never 
contain such a change.

That is partly why I don't want to hand-optimize difftime for the specific case 
of 64-bit time_t. We should continue to have a generic difftime that will work 
regardless of time_t width, as we may need it in Gnulib (even if we don't need 
it in this go-round).

> (but I don't see yet how this is going to help making glibc Y2038-proof.
> As far as I understand, glibc does not use gnulib and I don't think its
> build system ever compiles a single C source file into two object files
> for two different word sizes anyway).

As I said, doing all that is not something I recommend.

The goal here is not merely making glibc Y2038-proof; it is also keeping it 
glibc in sync with Gnulib when possible, while keeping Gnulib portable (which 
also includes keeping Gnulib Y2038-proof, on platforms with time_t that can go 
past Y2038).
  
Paul Eggert July 6, 2018, 10:40 p.m. UTC | #14
Bruno Haible wrote:

>    3) During the next source-code sync from glibc to gnulib, involving mktime.c
>       etc., the gnulib people (likely Paul) make sure that this source code can
>       still be used on non-glibc platforms with either 32-bit time_t or 64-bit
>       time_t. Usually this involves a couple of #ifs and conditional #includes.
>       These changes can then flow back into glibc. They won't have a functional
>       effect in glibc, therefore won't break the atomicity of step 1.

What I'm trying to do is to lessen the work in step (3) by reviewing the changes 
in earlier steps to make (3) simple. Albert had already drafted steps (1) and 
(2) but did so in ways that duplicated code and would have complicated (3).
  
Paul Eggert July 6, 2018, 10:54 p.m. UTC | #15
Albert ARIBAUD wrote:
> Do code syncs from glibc to gnulib happen on new glibc releases?

It depends. Some of Gnulib syncs soon after a commit to glibc master. Other 
parts of Gnulib sync more lackadasically. Glibc release boundaries don't matter 
much.

Conversely, sometimes Glibc syncs files from Gnulib. Though this is rarer, it 
happened a couple of days ago:

https://sourceware.org/git/?p=glibc.git;a=commit;h=eb04c21373e2a2885f3d52ff192b0499afe3c672

Come to think of it, we should update config/srclist.txt accordingly; I'll do 
that shortly.

> If they do, then my chick-and-egg problem remains

I don't quite know what you mean by chicken-and-egg problem, but it sounds like 
we don't have the sort of blockage that you're worried about.
  
Joseph Myers July 17, 2018, 8:40 p.m. UTC | #16
On Thu, 5 Jul 2018, Paul Eggert wrote:

> Although Gnulib hasn't needed a difftime module yet, it might need one once
> this 32- vs 64-bit time_t stuff lands into glibc, so let's keep difftime.c
> usable for Gnulib.

I don't think we should make such requests of contributers for code that 
is not shared with another project.  For code that is actually shared with 
gnulib (possibly with some divergence on both sides), yes, but not for 
arbitrary other functions simply because they might hypothetically be 
useful in other context in future.
  

Patch

From b6120a13a752335bc1c60d5d22a1cdb0b4403a7a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 20 Jun 2018 11:37:54 -0700
Subject: [PATCH 2/2] difftime: new __time64_t variant

* include/time.h (__difftime64): New macro or function decl.
* time/difftime.c: Include "time-internal.h" if not glibc, for
outside-glibc uses.
(subtract, __difftime64) Use __time64_t, not time_t.
(__difftime) [__TIMESIZAE != 64]: Turn into a wrapper function
for __difftime64, a new function that contains the old body
of __difftime.
---
 include/time.h  |  6 ++++++
 time/difftime.c | 36 ++++++++++++++++++++++++++----------
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/include/time.h b/include/time.h
index 5afb12305e..63cc855749 100644
--- a/include/time.h
+++ b/include/time.h
@@ -125,6 +125,12 @@  extern char * __strptime_internal (const char *rp, const char *fmt,
 				   struct tm *tm, void *statep,
 				   locale_t locparam) attribute_hidden;
 
+#if __TIMESIZE == 64
+# define __difftime64 __difftime
+#else
+extern double __difftime64 (__time64_t time1, __time64_t time0);
+#endif
+
 extern double __difftime (time_t time1, time_t time0);
 
 /* Use in the clock_* functions.  Size of the field representing the
diff --git a/time/difftime.c b/time/difftime.c
index 7727f5cdb5..4e765de624 100644
--- a/time/difftime.c
+++ b/time/difftime.c
@@ -23,19 +23,23 @@ 
 #include <float.h>
 #include <stdint.h>
 
+#ifndef _LIBC
+# include "time-internal.h"
+#endif
+
 #define TYPE_BITS(type) (sizeof (type) * CHAR_BIT)
 #define TYPE_SIGNED(type) ((type) -1 < 0)
 
 /* Return the difference between TIME1 and TIME0, where TIME0 <= TIME1.  */
 
 static double
-subtract (time_t time1, time_t time0)
+subtract (__time64_t time1, __time64_t time0)
 {
-  if (! TYPE_SIGNED (time_t))
+  if (! TYPE_SIGNED (__time64_t))
     return time1 - time0;
   else
     {
-      /* Optimize the common special cases where time_t
+      /* Optimize the common special cases where __time64_t
 	 can be converted to uintmax_t without losing information.  */
       uintmax_t dt = (uintmax_t) time1 - (uintmax_t) time0;
       double delta = dt;
@@ -43,7 +47,7 @@  subtract (time_t time1, time_t time0)
       if (UINTMAX_MAX / 2 < INTMAX_MAX)
 	{
 	  /* This is a rare host where uintmax_t has padding bits, and possibly
-	     information was lost when converting time_t to uintmax_t.
+	     information was lost when converting __time64_t to uintmax_t.
 	     Check for overflow by comparing dt/2 to (time1/2 - time0/2).
 	     Overflow occurred if they differ by more than a small slop.
 	     Thanks to Clive D.W. Feather for detailed technical advice about
@@ -74,9 +78,9 @@  subtract (time_t time1, time_t time0)
              1 is unsigned in C, so it need not be compared to zero.  */
 
 	  uintmax_t hdt = dt / 2;
-	  time_t ht1 = time1 / 2;
-	  time_t ht0 = time0 / 2;
-	  time_t dht = ht1 - ht0;
+	  __time64_t ht1 = time1 / 2;
+	  __time64_t ht0 = time0 / 2;
+	  __time64_t dht = ht1 - ht0;
 
 	  if (2 < dht - hdt + 1)
 	    {
@@ -97,17 +101,17 @@  subtract (time_t time1, time_t time0)
 
 /* Return the difference between TIME1 and TIME0.  */
 double
-__difftime (time_t time1, time_t time0)
+__difftime64 (__time64_t time1, __time64_t time0)
 {
   /* Convert to double and then subtract if no double-rounding error could
      result.  */
 
-  if (TYPE_BITS (time_t) <= DBL_MANT_DIG)
+  if (TYPE_BITS (__time64_t) <= DBL_MANT_DIG)
     return (double) time1 - (double) time0;
 
   /* Likewise for long double.  */
 
-  if (TYPE_BITS (time_t) <= LDBL_MANT_DIG)
+  if (TYPE_BITS (__time64_t) <= LDBL_MANT_DIG)
     return (long double) time1 - (long double) time0;
 
   /* Subtract the smaller integer from the larger, convert the difference to
@@ -115,4 +119,16 @@  __difftime (time_t time1, time_t time0)
 
   return time1 < time0 ? - subtract (time0, time1) : subtract (time1, time0);
 }
+
+#if __TIMESIZE != 64
+
+/* Return the difference between TIME1 and TIME0.  */
+double
+__difftime (time_t time1, time_t time0)
+{
+  return __difftime64 (time1, time0);
+}
+
+#endif
+
 strong_alias (__difftime, difftime)
-- 
2.17.1