manual: Add more documentation for the tm_isdst member of struct tm

Message ID 87zgdl4c75.fsf@oldenburg.str.redhat.com
State New
Series manual: Add more documentation for the tm_isdst member of struct tm

Checks

Context                             Check    Description
dj/TryBot-apply_patch               success  Patch applied to master at the time it was sent
dj/TryBot-32bit                     success  Build for i686
redhat-pt-bot/TryBot-still_applies  warning  Patch no longer applies to master

Commit Message

Florian Weimer Oct. 24, 2022, 12:51 p.m. UTC
  The meaning of the member has changed implicitly since the IANA
tz database started releasing zone data with negative DST.

---
 manual/time.texi | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)


base-commit: 8775479804cfea2bbe4dcdf19d6589264c96d5fb
  

Comments

Paul Eggert Oct. 24, 2022, 7:44 p.m. UTC | #1
Thanks for looking into the documentation problem here. I hope you don't 
mind some detailed comments.

On 2022-10-24 05:51, Florian Weimer via Libc-alpha wrote:

> +This field serves multiple purposes.  Historically, it is related to
> +daylight saving time.

The field is still related to daylight saving time, as described below, 
so I suggest rewording this to something like:

-----
This field indicates whether the daylight saving time is in effect.
-----


> +When clocks move backwards (due to time zone changes, or due to the
> +transition from daylight saving time to standard time), functions
> +converting to a broken-down time value such as @code{localtime} may
> +produce the same @code{tm_hour}, @code{tm_min}, @code{tm_sec} values for
> +different input times.  The @code{tm_isdst} field is used to
> +disambiguate these different points in time.

Unfortunately tm_isdst cannot always disambiguate these points. Although 
disambiguation works when the ambiguity is due to a DST fall-back 
transition, it cannot disambiguate in other cases. For example, with 
TZ='Europe/Volgograd', the timestamp 2020-12-27 01:30 is ambiguous but 
both times have tm_isdst == 0; this is because Volgograd permanently 
switched from +04 to +03 that day so standard time was in use both 
before and after the clocks were changed.
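
The arithmetic behind the Volgograd case can be sketched as follows. This is a minimal illustration, not glibc code: the helper names wall_clock and same_wall_clock are made up for this example, and the +04 and +03 offsets are passed in by hand rather than read from any time zone database. It only shows that two instants one hour apart produce identical calendar fields, so a flag that is 0 in both cases cannot tell them apart.

```c
#define _DEFAULT_SOURCE   /* for gmtime_r on glibc */
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Wall-clock fields seen at instant T in a zone that is UTC_OFFSET
   seconds east of UTC.  The offset is supplied by the caller; no
   time zone database is consulted (hypothetical helper).  */
struct tm
wall_clock (time_t t, long utc_offset)
{
  struct tm tm;
  time_t shifted = t + utc_offset;
  struct tm *r = gmtime_r (&shifted, &tm);
  assert (r != NULL);
  return tm;
}

/* True if A and B show the same calendar date and time of day
   (hypothetical helper).  */
bool
same_wall_clock (const struct tm *a, const struct tm *b)
{
  return (a->tm_year == b->tm_year && a->tm_mon == b->tm_mon
          && a->tm_mday == b->tm_mday && a->tm_hour == b->tm_hour
          && a->tm_min == b->tm_min && a->tm_sec == b->tm_sec);
}
```

For instance, wall_clock (1609018200, 4 * 3600) and wall_clock (1609021800, 3 * 3600) correspond to 2020-12-26 21:30 UTC at +04 and 22:30 UTC at +03; both yield 2020-12-27 01:30:00, so same_wall_clock reports them as equal.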


> +...  Historically, a positive
> +value indicates Daylight Saving Time is in effect, and a zero value that
> +it is not.

That's still the case. POSIX sometimes calls it the "Daylight Savings" 
flag, and sometimes "alternative time"; but either way it's the same thing.


> However, this interpretation depends on the data present in
> +the system's time zone database.

It's not simply TZDB; it can happen even if there is no TZDB installed, 
with a POSIX TZ string like TZ='IST-1GMT0,M10.5.0,M3.5.0/1' (valid for 
timestamps in Ireland after the year 1996).


> For example, for some time zones, the
> +role of positive and zero @code{tm_isdst} values are swapped in some
> +years.

The role is not swapped.  Some TZ settings have a negative DST offset; 
that is, their UT offset is smaller when tm_isdst is positive, than when 
tm_isdst is zero.

Perhaps rephrase the above to something like the following?

-----
A positive value indicates that daylight saving time (sometimes called 
"alternative time") is in effect. Although this is typically an hour 
further east of Greenwich than standard time, other DST offsets from 
standard time are possible, and the DST offset need not be positive in 
locations that observe DST in winter or during holidays.

In practice, the @code{tm_isdst} value is mostly obsolescent. It should 
not be used to calculate time zone names or UT offsets, as the 
@code{tm_zone} and @code{tm_gmtoff} values do this more easily and more 
reliably. Its only remaining role is as input to @code{mktime}, where it 
can disambiguate timestamps, though unfortunately this does not work in 
all cases.
----


> +If the @code{tm_isdst} member is not negative, its value should match a
> +previous result from @code{localtime}, @code{localtime_r},
> +@code{gmtime}, or @code{gmtime_r}.

This requirement is not in POSIX or the C Standard. Shouldn't we reword 
this to merely talk about positive versus zero versus negative values 
for tm_isdst?


> +...  Given that the
> +system time zone database does not necessarily map @code{tm_isdst}
> +values to popular notions of daylight saving time and standard time,
> +using a negative @code{tm_isdst} value as input to @code{mktime} is
> +generally preferable to letting the user input whether daylight saving
> +time is in effect or not.

Not sure I'd use the word "popular" here. Also, this should mention 
issues like the Volgograd one. How about replacing the above with the 
following:

-----
When your application does not know whether an ambiguous timestamp 
occurred before or after the clock moved back due to a DST fallback 
transition, it can use a negative @code{tm_isdst} to let @code{mktime} 
deduce which of the two values to return. Unfortunately, this approach 
does not work in the rarer case where the clock moves back due to a 
permanent change in the time zone; to avoid ambiguity in this situation 
you can instead record the timestamp in UTC and convert to local time 
for display only.
-----
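
A minimal sketch of the DST fall-back case described above, under stated assumptions: the function name ambiguous_to_time is made up for this example, and the POSIX TZ string for central European rules is an illustrative choice (picked because it needs no installed zone files), not something taken from the patch. Local 02:30 on 2022-10-30 occurs twice under these rules, once before and once after the clock moves back.

```c
#define _DEFAULT_SOURCE   /* for setenv on glibc */
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Convert local 2022-10-30 02:30:00 to a time_t using ISDST_HINT as
   the tm_isdst input.  Sets the process-wide TZ variable, so this is
   only suitable for a single-threaded demonstration.  */
time_t
ambiguous_to_time (int isdst_hint)
{
  /* Assumed rules: UTC+1 standard time, UTC+2 DST, DST ends on the
     last Sunday of October at 03:00 local time.  */
  int rc = setenv ("TZ", "CET-1CEST,M3.5.0,M10.5.0/3", 1);
  assert (rc == 0);
  tzset ();

  struct tm tm = { 0 };
  tm.tm_year = 2022 - 1900;
  tm.tm_mon = 10 - 1;
  tm.tm_mday = 30;
  tm.tm_hour = 2;
  tm.tm_min = 30;
  tm.tm_isdst = isdst_hint;   /* positive, zero, or negative */
  return mktime (&tm);
}
```

With a positive hint the result is the instant before the transition, with a zero hint it is the instant one hour later, and with a negative hint mktime picks one of the two. None of this helps in the permanent-change case, where both candidates have tm_isdst == 0.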


> -structure, including the members that were initially ignored.
> +structure, including the members that were initially ignored.  The
> +@code{tm_isdst} member is updated to reflect the time zone status of the
> +system time zone database at the specified time.

Since there may be no TZDB installed, I suggest rewording the last 
phrase to "to reflect whether daylight saving time is in effect".
  
Florian Weimer Oct. 24, 2022, 9:13 p.m. UTC | #2
* Paul Eggert:

>> +When clocks move backwards (due to time zone changes, or due to the
>> +transition from daylight saving time to standard time), functions
>> +converting to a broken-down time value such as @code{localtime} may
>> +produce the same @code{tm_hour}, @code{tm_min}, @code{tm_sec} values for
>> +different input times.  The @code{tm_isdst} field is used to
>> +disambiguate these different points in time.
> 
> Unfortunately tm_isdst cannot always disambiguate these points. Although
> disambiguation works when the ambiguity is due to a DST fall-back 
> transition, it cannot disambiguate in other cases. For example, with
> TZ='Europe/Volgograd', the timestamp 2020-12-27 01:30 is ambiguous but 
> both times have tm_isdst == 0; this is because Volgograd permanently
> switched from +04 to +03 that day so standard time was in use both 
> before and after the clocks were changed.

Huh.  I genuinely thought that you'd use tm_isdst to disambiguate such
cases.

>> For example, for some time zones, the
>> +role of positive and zero @code{tm_isdst} values are swapped in some
>> +years.
>
> The role is not swapped.  Some TZ settings have a negative DST offset;
> that is, their UT offset is smaller when tm_isdst is positive, than
> when tm_isdst is zero.

I don't know what to make of this:

$ zdump -v Europe/Dublin | grep 2022
Europe/Dublin  Sun Mar 27 00:59:59 2022 UT = Sun Mar 27 00:59:59 2022 GMT isdst=1 gmtoff=0
Europe/Dublin  Sun Mar 27 01:00:00 2022 UT = Sun Mar 27 02:00:00 2022 IST isdst=0 gmtoff=3600
Europe/Dublin  Sun Oct 30 00:59:59 2022 UT = Sun Oct 30 01:59:59 2022 IST isdst=0 gmtoff=3600
Europe/Dublin  Sun Oct 30 01:00:00 2022 UT = Sun Oct 30 01:00:00 2022 GMT isdst=1 gmtoff=0

And this:

#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int
main (void)
{
  if (setenv ("TZ", "Europe/Dublin", 1) != 0)
    err (1, "setenv");
  time_t secs = 1666644387;
  struct tm *tm = localtime (&secs);
  if (tm == NULL)
    err (1, "localtime");
  printf ("%04d-%02d-%02dT%02d:%02d:%02d %d\n",
          tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
          tm->tm_hour, tm->tm_min, tm->tm_sec, tm->tm_isdst);
}

prints (all of this is on Fedora):

2022-10-24T21:46:27 0

As far as I can tell, that's rather inconsistent with civilian use.
Directive 2000/84/EC also calls this “summer time”.

I thought it was due to a Volgograd-style disambiguation (but didn't
actually check).  If this doesn't come from that and merely serves to
express that IST is *not* daylight saving time (which is at least
debatable), why isn't isdst zero during the winter as well?

And if Ireland hasn't observed Daylight Saving Time in recent times,
shouldn't tzset set the daylight variable to 0 if the TZ data is trimmed
to post-1970 dates?

Thanks,
Florian
  
Paul Eggert Oct. 25, 2022, 5:49 p.m. UTC | #3
On 2022-10-24 14:13, Florian Weimer wrote:
> that's rather inconsistent with civilian use.

It depends on which civilians you talk to. Irish legislation says that 
IST (UTC+01) is Irish Standard Time, and that Ireland observes "winter 
time" (i.e., negative daylight saving offset) during winter. Irish 
winter time is called "Greenwich mean time". See:

https://github.com/eggert/tz/blob/main/europe#L359
https://www.irishstatutebook.ie/eli/1971/act/17/enacted/en/print

There's something similar in Morocco, though in their case it's Ramadan 
time not winter time.


> why isn't isdst zero during the winter as well?

Because Irish daylight saving time is observed in winter not summer. 
Also, if tm_isdst were always zero, it'd be harder for mktime to 
disambiguate Irish timestamps.

POSIX requires support for this sort of thing. Even if TZDB's 
Europe/Dublin entry were changed to disagree with Irish legislation, 
glibc would still need to support settings like this:

TZ='IST-1GMT0,M10.5.0,M3.5.0/1'

This POSIX-conforming TZ setting follows current Irish practice and 
specifies negative DST during winter.

Perhaps we could mention this possibility in the manual.
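
As a rough sketch of how an application would see this, the helper below converts a timestamp under a given TZ setting and returns the resulting struct tm. The name local_time_in is made up for this example, and it assumes a libc that exposes tm_gmtoff (as glibc does with _DEFAULT_SOURCE).

```c
#define _DEFAULT_SOURCE   /* setenv, localtime_r, tm_gmtoff on glibc */
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Convert T to local time under the TZ setting TZ_VALUE.  This
   changes the process-wide TZ environment variable, so it is only
   suitable for a single-threaded demonstration.  */
struct tm
local_time_in (const char *tz_value, time_t t)
{
  struct tm tm;
  int rc = setenv ("TZ", tz_value, 1);
  assert (rc == 0);
  tzset ();   /* localtime_r is not required to re-read TZ itself */
  struct tm *r = localtime_r (&t, &tm);
  assert (r != NULL);
  return tm;
}
```

Called with the TZ string above, the timestamp 1666644387 (late October 2022) should come back with tm_isdst == 0 and tm_gmtoff == 3600, while 1669896000 (2022-12-01 12:00:00 UTC) should come back with a positive tm_isdst and tm_gmtoff == 0. In other words, the UT offset is smaller while the flag is set.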

PS. While we're on the subject of documentation, perhaps we should also 
mention that mktime can use tm_gmtoff to disambiguate 
otherwise-ambiguous time stamps. This is reliable even for situations 
like Volgograd in 2020, and it conforms to POSIX. A bit tricky to 
document, though.
  
Florian Weimer Oct. 25, 2022, 6:07 p.m. UTC | #4
* Paul Eggert:

> On 2022-10-24 14:13, Florian Weimer wrote:
>> that's rather inconsistent with civilian use.
>
> It depends on which civilians you talk to. Irish legislation says that
> IST (UTC+01) is Irish Standard Time, and that Ireland observes "winter 
> time" (i.e., negative daylight saving offset) during winter. Irish
> winter time is called "Greenwich mean time". See:
>
> https://github.com/eggert/tz/blob/main/europe#L359
> https://www.irishstatutebook.ie/eli/1971/act/17/enacted/en/print

I think in other cases, tz goes with actual practices on the ground, not
what the internationally recognized government says the time should
be.  Just saying.

> There's something similar in Morocco, though in their case it's
> Ramadan time not winter time.
>
>> why isn't isdst zero during the winter as well?
>
> Because Irish daylight saving time is observed in winter not
> summer.

This seems to be a bit of a jump?  Is it an inference from the fact that
summer time is called “Irish Standard Time”, and that the Act gives
permission to change or abolish winter time, but not standard time?  So
winter time is presented as an exception.  Daylight saving time is also
presented as an exception, therefore winter time is daylight saving
time?

I'm just trying to follow the line of thought.

> PS. While we're on the subject of documentation, perhaps we should
> also mention that mktime can use tm_gmtoff to disambiguate 
> otherwise-ambiguous time stamps. This is reliable even for situations
> like Volgograd in 2020, and it conforms to POSIX. A bit tricky to 
> document, though.

Huh.  How can we do this tm_gmtoff thing without breaking C
compatibility?  I'm worried that a conforming application might not
initialize tm_gmtoff (or rather __tm_gmtoff) properly.

Thanks,
Florian
  
Paul Eggert Oct. 25, 2022, 6:27 p.m. UTC | #5
On 2022-10-25 11:07, Florian Weimer wrote:

> I think in other cases, tz goes with actual practices on the ground

There is no real practice on the ground here, as nobody outside of libc 
nerds cares (or should care) whether tm_isdst is zero or positive.


> I'm just trying to follow the line of thought.

It's the same line of thought that Morocco uses. The ordinary time is 
the time observed most of the year. The unusual time is daylight saving 
time (or "alternative time", as POSIX sometimes puts it) and is observed 
during winter, or during Ramadan, or whatever. Other countries have done 
similar things in the past.


>> PS. While we're on the subject of documentation, perhaps we should
>> also mention that mktime can use tm_gmtoff to disambiguate
>> otherwise-ambiguous time stamps. This is reliable even for situations
>> like Volgograd in 2020, and it conforms to POSIX. A bit tricky to
>> document, though.
> 
> Huh.  How can we do this tm_gmtoff thing without breaking C
> compatibility?  I'm worried that a conforming application might not
> initialize tm_gmtoff (or rather __tm_gmtoff) properly.

It doesn't break compatibility if tm_gmtoff is inspected only when 
tm_isdst doesn't resolve the ambiguity. At that point, if tm_gmtoff 
matches the tm_gmtoff of one of the two plausible timestamps, mktime can 
use that info to choose which one. It's OK if the input tm_gmtoff is 
uninitialized, as mktime could choose that one anyway. (Though it's 
officially undefined behavior to access uninitialized memory, in 
practice it's OK here.)

With this approach, mktime is always the inverse of localtime even in 
cases like Volgograd, which is a clear win. Cute, huh? Though not 
currently documented.
  
Florian Weimer Oct. 26, 2022, 11:21 a.m. UTC | #6
* Paul Eggert:

> On 2022-10-25 11:07, Florian Weimer wrote:
>
>> I think in other cases, tz goes with actual practices on the ground
>
> There is no real practice on the ground here, as nobody outside of
> libc nerds cares (or should care) whether tm_isdst is zero or
> positive.

That's not my experience.  We have heard from customers that they use
non-negative values in their applications.

Maybe we should just document that applications should set this field to
-1 when constructing struct tm data?

>> I'm just trying to follow the line of thought.
>
> It's the same line of thought that Morocco uses. The ordinary time is
> the time observed most of the year. The unusual time is daylight
> saving time (or "alternative time", as POSIX sometimes puts it) and is
> observed during winter, or during Ramadan, or whatever. Other
> countries have done similar things in the past.

In case of Ireland it seems an artificial complication, though.  The
perception seems to be that IST is summer time, not standard time, and
winter time is not the exception.

>>> PS. While we're on the subject of documentation, perhaps we should
>>> also mention that mktime can use tm_gmtoff to disambiguate
>>> otherwise-ambiguous time stamps. This is reliable even for situations
>>> like Volgograd in 2020, and it conforms to POSIX. A bit tricky to
>>> document, though.
>> Huh.  How can we do this tm_gmtoff thing without breaking C
>> compatibility?  I'm worried that a conforming application might not
>> initialize tm_gmtoff (or rather __tm_gmtoff) properly.
>
> It doesn't break compatibility if tm_gmtoff is inspected only when
> tm_isdst doesn't resolve the ambiguity. At that point, if tm_gmtoff 
> matches the tm_gmtoff of one of the two plausible timestamps, mktime
> can use that info to choose which one. It's OK if the input tm_gmtoff
> is uninitialized, as mktime could choose that one anyway. (Though it's 
> officially undefined behavior to access uninitialized memory, in
> practice it's OK here.)
>
> With this approach, mktime is always the inverse of localtime even in
> cases like Volgograd, which is a clear win. Cute, huh? Though not 
> currently documented.

Okay, I don't like it, but it seems to be sort-of okay-ish.

Thanks,
Florian
  
Paul Eggert Oct. 26, 2022, 7:05 p.m. UTC | #7
On 2022-10-26 04:21, Florian Weimer wrote:
> * Paul Eggert:
> 
>> On 2022-10-25 11:07, Florian Weimer wrote:
>>
>>> I think in other cases, tz goes with actual practices on the ground
>>
>> There is no real practice on the ground here, as nobody outside of
>> libc nerds cares (or should care) whether tm_isdst is zero or
>> positive.
> 
> That's not my experience.  We have heard from customers that they use
> non-negative values in their applications.

Oh, by "practice on the ground" I meant ordinary people and mass media 
and the like, not software developers who have to deal with the tm_isdst 
mess.


> Maybe we should just document that applications should set this field to
> -1 when constructing struct tm data?

Yes, if you're deriving struct tm data from the outside world and you 
have no idea what tm_isdst etc. should be, your apps should set tm_isdst 
to -1. However, if the outside world tells you tm_gmtoff (which is 
pretty common these days, e.g., see the "Date:" line in this email), 
then your apps should also set tm_gmtoff to what the outside world tells 
you, before you call mktime.


> In case of Ireland it seems an artificial complication, though.  The
> perception seems to be that IST is summer time, not standard time

My impression is different, in that people in Ireland typically say 
"summer time" or "winter time". They typically do not say "standard 
time" or "daylight saving time". And as it happens, summer time = 
standard time in Ireland.

Part of the confusion here is that the British English phrase "summer 
time" is not the same thing as the American English phrase "daylight 
saving time". The two phrases mean the same thing for UK and US 
timekeeping, but not for Irish timekeeping. Since the Irish typically 
use British English, they refer to their timekeeping correctly.

I think this confusion is partly why POSIX sometimes says "alternative 
time" instead of "daylight saving time" - it's trying to avoid using 
either British or American English and thus avoid the confusion. Though 
I wish POSIX would stick to just one phrase rather than using two, since 
the two phrases in POSIX mean the same thing. I expect the problem here 
is that the C Standard says "daylight saving time" and POSIX copies from 
that when it has to, and uses "alternative time" when it doesn't. Though 
the use of two terms in POSIX simply adds to the confusion....
  
Florian Weimer Oct. 27, 2022, 10:09 a.m. UTC | #8
* Paul Eggert:

>> Maybe we should just document that applications should set this field to
>> -1 when constructing struct tm data?
>
> Yes, if you're deriving struct tm data from the outside world and you
> have no idea what tm_isdst etc. should be, your apps should set
> tm_isdst to -1. However, if the outside world tells you tm_gmtoff
> (which is pretty common these days, e.g., see the "Date:" line in this
> email), then your apps should also set tm_gmtoff to what the outside
> world tells you, before you call mktime.

Okay, that would indeed be helpful advice.  I'll see if I can come up
with a new patch.

>> In case of Ireland it seems an artificial complication, though.  The
>> perception seems to be that IST is summer time, not standard time
>
> My impression is different, in that people in Ireland typically say
> "summer time" or "winter time". They typically do not say "standard 
> time" or "daylight saving time". And as it happens, summer time =
> standard time in Ireland.

I think RTE is not exactly fringe media in Ireland, and they recently
wrote (admittedly largely copying from last year):

| When do the clocks change in Ireland in 2022?
|
| Daylight saving time in Ireland began (went forward an hour) at 01:00
| on Sunday, 27 March 2022 and will end (go back an hour) at 02:00 on
| Sunday, 30 October 2022.

<https://www.rte.ie/lifestyle/living/2022/0309/1285354-everything-you-need-to-know-about-the-clocks-changing-in-2022/>

This is not just one journalist making an error.  I think the cultural
convention *is* different.

Thanks,
Florian
  

Patch

diff --git a/manual/time.texi b/manual/time.texi
index 0c7a942b4c..8632bd9b77 100644
--- a/manual/time.texi
+++ b/manual/time.texi
@@ -1010,10 +1010,24 @@  range @code{0} through @code{365}).
 @item int tm_isdst
 @cindex Daylight Saving Time
 @cindex summer time
-This is a flag that indicates whether Daylight Saving Time is (or was, or
-will be) in effect at the time described.  The value is positive if
-Daylight Saving Time is in effect, zero if it is not, and negative if the
-information is not available.
+This field serves multiple purposes.  Historically, it is related to
+daylight saving time.
+
+When clocks move backwards (due to time zone changes, or due to the
+transition from daylight saving time to standard time), functions
+converting to a broken-down time value such as @code{localtime} may
+produce the same @code{tm_hour}, @code{tm_min}, @code{tm_sec} values for
+different input times.  The @code{tm_isdst} field is used to
+disambiguate these different points in time.  Historically, a positive
+value indicates Daylight Saving Time is in effect, and a zero value that
+it is not.  However, this interpretation depends on the data present in
+the system's time zone database.  For example, for some time zones, the
+role of positive and zero @code{tm_isdst} values are swapped in some
+years.
+
+If the broken-down time is used as an input for a conversion function in
+the other direction, such as @code{mktime}, a negative value can be
+used to indicate that the information is not available.
 
 @item long int tm_gmtoff
 This field describes the time zone that was used to compute this
@@ -1219,6 +1233,21 @@  simple time representation.  It also normalizes the contents of the
 broken-down time structure, and fills in some components based on the
 values of the others.
 
+If the @code{tm_isdst} member is not negative, its value should match a
+previous result from @code{localtime}, @code{localtime_r},
+@code{gmtime}, or @code{gmtime_r}.  If @code{tm_isdst} is negative and
+the calendar time is not ambiguous, @code{mktime} will use this time.
+For ambiguous inputs with a negative @code{tm_isdst} value, an arbitrary
+choice is made.  If a non-negative value was specified for
+@code{tm_isdst} and the system time zone database contains data
+conflicting with the request in @code{tm_isdst}, @code{mktime} attempts
+to adjust the computed result in an unspecified way.  Given that the
+system time zone database does not necessarily map @code{tm_isdst}
+values to popular notions of daylight saving time and standard time,
+using a negative @code{tm_isdst} value as input to @code{mktime} is
+generally preferable to letting the user input whether daylight saving
+time is in effect or not.
+
 The @code{mktime} function ignores the specified contents of the
 @code{tm_wday}, @code{tm_yday}, @code{tm_gmtoff}, and @code{tm_zone}
 members of the broken-down time
@@ -1226,7 +1255,9 @@  structure.  It uses the values of the other components to determine the
 calendar time; it's permissible for these components to have
 unnormalized values outside their normal ranges.  The last thing that
 @code{mktime} does is adjust the components of the @var{brokentime}
-structure, including the members that were initially ignored.
+structure, including the members that were initially ignored.  The
+@code{tm_isdst} member is updated to reflect the time zone status of the
+system time zone database at the specified time.
 
 If the specified broken-down time cannot be represented as a simple time,
 @code{mktime} returns a value of @code{(time_t)(-1)} and does not modify