Message ID | 20190211163728.31655-1-eggert@cs.ucla.edu |
---|---|
State | Rejected |
Headers |
Received: (qmail 64173 invoked by alias); 11 Feb 2019 16:37:41 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 64164 invoked by uid 89); 11 Feb 2019 16:37:41 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS autolearn=ham version=3.3.2 spammy=NEWS, affecting, universal, Universal X-HELO: zimbra.cs.ucla.edu From: Paul Eggert <eggert@cs.ucla.edu> To: libc-alpha@sourceware.org Cc: Paul Eggert <eggert@cs.ucla.edu> Subject: [PATCH] Fix multiple minor tzset glitches [BZ #24004] Date: Mon, 11 Feb 2019 08:37:28 -0800 Message-Id: <20190211163728.31655-1-eggert@cs.ucla.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable |
Commit Message
Paul Eggert
Feb. 11, 2019, 4:37 p.m. UTC
Avoid polling if TZ is unset. Consistently ignore leading ":" in TZ as per documentation. Treat TZ="" as Universal Time without leap seconds, for consistency with upstream. * time/tzset.c (tzset_internal): Check for null TZ first. Do not substitute "Universal" for ""; this is __tzfile_read’s job and we were doing it incorrectly as we treated TZ="" differently from TZ=":". * time/tzfile.c (__tzfile_read): Simplify, as FILE cannot be null. --- ChangeLog | 13 +++++++++++++ NEWS | 7 +++++++ time/tzfile.c | 5 +---- time/tzset.c | 15 ++++++--------- 4 files changed, 27 insertions(+), 13 deletions(-)
Comments
* Paul Eggert:
> Avoid polling if TZ is unset.
I thought this polling (for reloading /etc/localtime if it has changed)
was a feature. I'm not sure if we can remove it.
Thanks,
Florian
On 2/18/19 1:37 AM, Florian Weimer wrote: > I thought this polling (for reloading /etc/localtime if it has changed) > was a feature. I'm not sure if we can remove it. No such polling is documented. On the contrary, the glibc manual says that an unset TZ is treated as if it were set to ':/etc/localtime'. There is no polling when you set TZ=':/etc/localtime' (or to TZ='/etc/localtime', or to any other value). So the code and the documentation disagree, and one or the other needs to be fixed. Neither Solaris nor OpenBSD poll, so portable programs cannot assume that polling occurs. (These are the only two other systems I checked.) Commentary like this: https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/ says that polling is a bad idea, and gives TZ=':/etc/localtime' as a workaround for glibc. I'm sympathetic, as the polling makes glibc look bad compared to the competition.
* Paul Eggert: > On 2/18/19 1:37 AM, Florian Weimer wrote: >> I thought this polling (for reloading /etc/localtime if it has changed) >> was a feature. I'm not sure if we can remove it. > > No such polling is documented. On the contrary, the glibc manual says > that an unset TZ is treated as if it were set to > ':/etc/localtime'. There is no polling when you set > TZ=':/etc/localtime' (or to TZ='/etc/localtime', or to any other > value). So the code and the documentation disagree, and one or the > other needs to be fixed. > > Neither Solaris nor OpenBSD poll, so portable programs cannot assume > that polling occurs. (These are the only two other systems I checked.) > > Commentary like this: > > https://blog.packagecloud.io/eng/2017/02/21/set-environment-variable-save-thousands-of-system-calls/ > > says that polling is a bad idea, and gives TZ=':/etc/localtime' as a > workaround for glibc. I'm sympathetic, as the polling makes glibc look > bad compared to the competition. On the other hand, for files which we do not poll for changes (such as /etc/resolv.conf previously, and /etc/gai.conf and /etc/nsswitch.conf presently), we regularly receive requests to add such polling. For nsswitch.conf, this is clearly documented in the manual page, and yet people still ask for it. Thanks, Florian
On 3/13/19 7:52 AM, Florian Weimer wrote: > On the other hand, for files which we do not poll for changes (such as > /etc/resolv.conf previously, and /etc/gai.conf and /etc/nsswitch.conf > presently), we regularly receive requests to add such polling. Yes, and it would be good to add a way to control polling of /etc files, and whatever method is used to control polling of /etc/nsswitch.conf etc. could also be used to control polling of /etc/localtime. Still, the current situation with /etc/localtime is not good: it disagrees with the documentation and with common practice elsewhere. So I suggest that we fix it and also look into adding an API to control polling. One API might be something like a new include file etcpoll.h with an API like this: etcpoll (ETC_NSSWITCH_CONF, ETCPOLL_NOW); etcpoll (ETC_GAI_CONF, ETCPOLL_ALWAYS); etcpoll (ETC_LOCALTIME, ETCPOLL_DISABLE); where NOW means reconsult the file now, ALWAYS means always reconsult the file before every use of its data, and DISABLE means stop reconsulting. That sort of thing.
* Paul Eggert: > On 3/13/19 7:52 AM, Florian Weimer wrote: >> On the other hand, for files which we do not poll for changes (such as >> /etc/resolv.conf previously, and /etc/gai.conf and /etc/nsswitch.conf >> presently), we regularly receive requests to add such polling. > > Yes, and it would be good to add a way to control polling of /etc files, > and whatever method is used to control polling of /etc/nsswitch.conf > etc. could also be used to control polling of /etc/localtime. Still, the > current situation with /etc/localtime is not good: it disagrees with the > documentation and with common practice elsewhere. So I suggest that we > fix it and also look into adding an API to control polling. > > One API might be something like a new include file etcpoll.h with an API > like this: > > etcpoll (ETC_NSSWITCH_CONF, ETCPOLL_NOW); > etcpoll (ETC_GAI_CONF, ETCPOLL_ALWAYS); > etcpoll (ETC_LOCALTIME, ETCPOLL_DISABLE); > > where NOW means reconsult the file now, ALWAYS means always reconsult > the file before every use of its data, and DISABLE means stop > reconsulting. That sort of thing. I think this should be a configuration file. We can use a shared mapping (or use /etc/ld.so.cache) to make up-to-date checks really, really cheap. But I think we should apply this mechanism only to files where the system administrator has enabled this feature. Thanks, Florian
On 3/14/19 7:10 AM, Florian Weimer wrote: > * Paul Eggert: > >> Yes, and it would be good to add a way to control polling of /etc files, >> and whatever method is used to control polling of /etc/nsswitch.conf >> etc. could also be used to control polling of /etc/localtime. > I think this should be a configuration file. We can use a shared > mapping (or use /etc/ld.so.cache) to make up-to-date checks really, > really cheap. But I think we should apply this mechanism only to files > where the system administrator has enabled this feature. Will the configuration file will have also an entry to control whether the configuration file itself is polled? :-) Come to think of it, this has the same feel as the other tunables, so perhaps it'd be better to add it to the tunables feature. Something like this: export GLIBC_TUNABLES='glibc.etcpoll.nsswitch=1' in order to enable polling of /etc/nsswitch.conf.
On 14/03/2019 13:47, Paul Eggert wrote: > On 3/14/19 7:10 AM, Florian Weimer wrote: >> * Paul Eggert: >> >>> Yes, and it would be good to add a way to control polling of /etc files, >>> and whatever method is used to control polling of /etc/nsswitch.conf >>> etc. could also be used to control polling of /etc/localtime. >> I think this should be a configuration file. We can use a shared >> mapping (or use /etc/ld.so.cache) to make up-to-date checks really, >> really cheap. But I think we should apply this mechanism only to files >> where the system administrator has enabled this feature. > > Will the configuration file will have also an entry to control whether > the configuration file itself is polled? :-) > > Come to think of it, this has the same feel as the other tunables, so > perhaps it'd be better to add it to the tunables feature. Something like > this: > > export GLIBC_TUNABLES='glibc.etcpoll.nsswitch=1' > > in order to enable polling of /etc/nsswitch.conf. > +1 on making this a tunable. Adding another configure switch means another build permutation to check for consistency.
* Paul Eggert: > On 3/14/19 7:10 AM, Florian Weimer wrote: >> * Paul Eggert: >> >>> Yes, and it would be good to add a way to control polling of /etc files, >>> and whatever method is used to control polling of /etc/nsswitch.conf >>> etc. could also be used to control polling of /etc/localtime. >> I think this should be a configuration file. We can use a shared >> mapping (or use /etc/ld.so.cache) to make up-to-date checks really, >> really cheap. But I think we should apply this mechanism only to files >> where the system administrator has enabled this feature. > > Will the configuration file will have also an entry to control whether > the configuration file itself is polled? :-) With the mapped file, there is no file polling as such, just a few memory loads. The cost is that the system administrator needs to run a command after updating any of the covered configuration files, which is why this has to be optional (and opt-in per configuration file). There probably aren't that many configuration files where this is needed. Thanks, Florian
Hi, Le jeudi 14 mars 2019 à 19:48 +0100, Florian Weimer a écrit : > * Paul Eggert: > > On 3/14/19 7:10 AM, Florian Weimer wrote: > > > * Paul Eggert: > > > > > > > Yes, and it would be good to add a way to control polling of /etc files, > > > > and whatever method is used to control polling of /etc/nsswitch.conf > > > > etc. could also be used to control polling of /etc/localtime. > > > I think this should be a configuration file. We can use a shared > > > mapping (or use /etc/ld.so.cache) to make up-to-date checks really, > > > really cheap. But I think we should apply this mechanism only to files > > > where the system administrator has enabled this feature. > > > > Will the configuration file will have also an entry to control whether > > the configuration file itself is polled? :-) > > With the mapped file, there is no file polling as such, just a few > memory loads. The cost is that the system administrator needs to run a > command after updating any of the covered configuration files, which is > why this has to be optional (and opt-in per configuration file). There > probably aren't that many configuration files where this is needed. > I think there's a caveat with memory mapped file: if a new version of the configuration file is created and renamed to the configuration location, replacing the previous one, applications won't notice the update, as the memory mapping will still reference the previous file. Regards.
* Yann Droneaud: >> With the mapped file, there is no file polling as such, just a few >> memory loads. The cost is that the system administrator needs to run a >> command after updating any of the covered configuration files, which is >> why this has to be optional (and opt-in per configuration file). There >> probably aren't that many configuration files where this is needed. > > I think there's a caveat with memory mapped file: if a new version of > the configuration file is created and renamed to the configuration > location, replacing the previous one, applications won't notice the > update, as the memory mapping will still reference the previous file. Sure, but the configuration compiler can write a flag to the old mapping that indicates it is out of date. Checking that flag has to be part of the sequence that verifies if polling is required. I believe nscd does something similar when it invalidates a mapping.
Hi, Le vendredi 15 mars 2019 à 16:05 +0100, Florian Weimer a écrit : > * Yann Droneaud: > > > > With the mapped file, there is no file polling as such, just a few > > > memory loads. The cost is that the system administrator needs to run a > > > command after updating any of the covered configuration files, which is > > > why this has to be optional (and opt-in per configuration file). There > > > probably aren't that many configuration files where this is needed. > > > > I think there's a caveat with memory mapped file: if a new version of > > the configuration file is created and renamed to the configuration > > location, replacing the previous one, applications won't notice the > > update, as the memory mapping will still reference the previous file. > > Sure, but the configuration compiler can write a flag to the old > mapping that indicates it is out of date. Checking that flag has to > be part of the sequence that verifies if polling is required. > Yes, this address my concern. Thanks. (Assuming the configuration compiler is used, and not a random stackoverflow copy pasted shell script ;) Regards.
* Paul Eggert: > On 2/18/19 1:37 AM, Florian Weimer wrote: >> I thought this polling (for reloading /etc/localtime if it has changed) >> was a feature. I'm not sure if we can remove it. > > No such polling is documented. On the contrary, the glibc manual says > that an unset TZ is treated as if it were set to > ':/etc/localtime'. There is no polling when you set > TZ=':/etc/localtime' (or to TZ='/etc/localtime', or to any other > value). So the code and the documentation disagree, and one or the > other needs to be fixed. > > Neither Solaris nor OpenBSD poll, so portable programs cannot assume > that polling occurs. (These are the only two other systems I checked.) It turns out that polling was implemented in response to this user bug report: <https://sourceware.org/bugzilla/show_bug.cgi?id=154> I must say that the use case in the bug report is not unreasonable. Thanks, Florian
On 4/10/19 7:57 AM, Florian Weimer wrote: > * Paul Eggert: > >> On 2/18/19 1:37 AM, Florian Weimer wrote: >>> I thought this polling (for reloading /etc/localtime if it has changed) >>> was a feature. I'm not sure if we can remove it. >> >> No such polling is documented. On the contrary, the glibc manual says >> that an unset TZ is treated as if it were set to >> ':/etc/localtime'. There is no polling when you set >> TZ=':/etc/localtime' (or to TZ='/etc/localtime', or to any other >> value). So the code and the documentation disagree, and one or the >> other needs to be fixed. >> >> Neither Solaris nor OpenBSD poll, so portable programs cannot assume >> that polling occurs. (These are the only two other systems I checked.) > > It turns out that polling was implemented in response to this user bug > report: > > <https://sourceware.org/bugzilla/show_bug.cgi?id=154> > > I must say that the use case in the bug report is not unreasonable. I agree that today we have *millions* of people travelling through timezones and they may be adjusting localtime on their laptop continuously and the timezone should be updated and reflected in the logs. I think this continues to show the following trend in expectation: * APIs are MT-safe. * APIs dynamically respond to changing context. - Location. - Network. - Language. Few if any developers are asking us to make APIs data read-once and immutable after reading. Timezone information is updated on average a dozen times a year, that's a once-a-month restart that's required to get this data if we don't internally reload that data. I think there *have* to be ways in which this data can be dynamically updated without a huge performance penalty. I find the following arguments weak: (1) Solaris nor OpenBSD Poll. (2) Portable programs will have to poll. The reason I find them weak is because they do not restrict us from doing the best we can for our developers on GNU/Linux. We can offer polling by default, and users who only care about GNU/Linux can write less code, and save on the maintenance of doing polling correctly (or having to pay the full cost of tzset()). $0.02.
On 4/10/19 6:18 AM, Carlos O'Donell wrote: > I think there *have* to be ways in which this data can be dynamically > updated without a huge performance penalty. What ways are those, though? It's not like the penalty is zero. One possible compromise would be for localtime to poll, and for localtime_r to not poll (you must call tzset to poll), and tell people "if you want performance, use localtime_r". That might hit a sweet spot of doing polling for naive programs where performance matters, and giving higher performance for applications that care about timestamp performance.
On 4/11/19 1:51 PM, Paul Eggert wrote: > On 4/10/19 6:18 AM, Carlos O'Donell wrote: >> I think there *have* to be ways in which this data can be dynamically >> updated without a huge performance penalty. > > What ways are those, though? It's not like the penalty is zero. "without huge performance penalty" ;-) Without threads you have to stat at some point to notice the file updated. Cost. You have to use RCU and a pointer to the binary data. Cost. I also think we need to do something similar for locale data, and the NSS reloading that DJ is looking at. In total that's 4 interfaces which could use a dynamic update/commit style mechanism in glibc. I'm going to go look and see if anyone has best practice for this in C. The alternative: (a) Check for update? Provide a function that tells the user if the on-disk data has changed. e.g tzdb_changed(), locale_changed(). The function lets the user control the cost of the stat and when it happens, but isolates them from the implemetnation. (b) Commit update. Provide a function to do the thread-safe update, which the user can execute whenever they want e.g. tzset(), locale_reload(). The function lets the user control the cost of the reload and when it happens, and again isolates them from the implementation. You still need to have something like RCU or an atomic protocol to make the update thread-safe. That cost never goes away, but it's minor compared to when to check and when to commit the update. > One possible compromise would be for localtime to poll, and for > localtime_r to not poll (you must call tzset to poll), and tell people > "if you want performance, use localtime_r". That might hit a sweet spot > of doing polling for naive programs where performance matters, and > giving higher performance for applications that care about timestamp > performance. If users have to make code changes it's better to just add a new api for "Check for update?" and "Do the update!".
On 4/12/19 10:07 AM, Carlos O'Donell wrote: >> One possible compromise would be for localtime to poll, and for >> localtime_r to not poll... > > If users have to make code changes it's better to just add a new api > for "Check for update?" and "Do the update!". If we add that API, will we also change localtime() to stop checking for and doing updates in the (undocumented) cases where localtime checks for and does them now? And if not, how will programs arrange for localtime to be faster? This would seem to be the key question.
On 4/12/19 2:41 PM, Paul Eggert wrote: > On 4/12/19 10:07 AM, Carlos O'Donell wrote: >>> One possible compromise would be for localtime to poll, and for >>> localtime_r to not poll... >> >> If users have to make code changes it's better to just add a new api >> for "Check for update?" and "Do the update!". > > If we add that API, will we also change localtime() to stop checking for > and doing updates in the (undocumented) cases where localtime checks for > and does them now? And if not, how will programs arrange for localtime > to be faster? This would seem to be the key question. Sure! I would be on board with changing localtime() to stop checking for and doing updates (unless someone has a convincing argument not to do this).
* Carlos O'Donell: > On 4/12/19 2:41 PM, Paul Eggert wrote: >> On 4/12/19 10:07 AM, Carlos O'Donell wrote: >>>> One possible compromise would be for localtime to poll, and for >>>> localtime_r to not poll... >>> If users have to make code changes it's better to just add a new >>> api >>> for "Check for update?" and "Do the update!". >> >> If we add that API, will we also change localtime() to stop checking for >> and doing updates in the (undocumented) cases where localtime checks for >> and does them now? And if not, how will programs arrange for localtime >> to be faster? This would seem to be the key question. > Sure! I would be on board with changing localtime() to stop checking > for and doing updates (unless someone has a convincing argument not to > do this). I want to make localtime the canonical interface, using thread-local storage. Thanks, Florian
diff --git a/ChangeLog b/ChangeLog index 6f1d967f62..831d89ba8f 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,16 @@ +2019-02-11 Paul Eggert <eggert@cs.ucla.edu> + + Fix multiple minor tzset glitches [BZ #24004] + Avoid polling if TZ is unset. + Consistently ignore leading ":" in TZ as per documentation. + Treat TZ="" as Universal Time without leap seconds, + for consistency with upstream. + * time/tzset.c (tzset_internal): Check for null TZ first. + Do not substitute "Universal" for ""; this is __tzfile_read’s + job and we were doing it incorrectly as we treated TZ="" + differently from TZ=":". + * time/tzfile.c (__tzfile_read): Simplify, as FILE cannot be null. + 2019-02-11 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrtf): diff --git a/NEWS b/NEWS index 0a3b6c7a5a..87de5030c1 100644 --- a/NEWS +++ b/NEWS @@ -22,6 +22,13 @@ Deprecated and removed features, and other changes affecting compatibility: definitions in libc will be used automatically, which have been available since glibc 2.17. +* If the TZ environment variable is unset, functions like localtime no + longer stat the file /etc/localtime if the file has already been + consulted. Instead, they reuse the cached timezone info, just as they + would with TZ="/etc/localtime". If TZ="", these functions now omit leap + seconds even on unusual installations where TZ="Universal" identifies a + TZif file with leap seconds; this is for consistency with tzcode and BSD. + Changes to build and runtime requirements: * GCC 6.2 or later is required to build the GNU C Library. diff --git a/time/tzfile.c b/time/tzfile.c index 7229ed93b7..db66f22a4e 100644 --- a/time/tzfile.c +++ b/time/tzfile.c @@ -115,10 +115,7 @@ __tzfile_read (const char *file, size_t extra, char **extrap) __use_tzfile = 0; - if (file == NULL) - /* No user specification; use the site-wide default. */ - file = TZDEFAULT; - else if (*file == '\0') + if (*file == '\0') /* User specified the empty string; use UTC with no leap seconds. */ goto ret_free_transitions; else diff --git a/time/tzset.c b/time/tzset.c index 307eb651ad..1903ef864d 100644 --- a/time/tzset.c +++ b/time/tzset.c @@ -375,25 +375,22 @@ tzset_internal (int always) /* Examine the TZ environment variable. */ tz = getenv ("TZ"); - if (tz && *tz == '\0') - /* User specified the empty string; use UTC explicitly. */ - tz = "Universal"; + + if (tz == NULL) + /* No user specification; use the site-wide default. */ + tz = TZDEFAULT; /* A leading colon means "implementation defined syntax". We ignore the colon and always use the same algorithm: try a data file, and if none exists parse the 1003.1 syntax. */ - if (tz && *tz == ':') + if (*tz == ':') ++tz; /* Check whether the value changed since the last run. */ - if (old_tz != NULL && tz != NULL && strcmp (tz, old_tz) == 0) + if (old_tz != NULL && strcmp (tz, old_tz) == 0) /* No change, simply return. */ return; - if (tz == NULL) - /* No user specification; use the site-wide default. */ - tz = TZDEFAULT; - tz_rules[0].name = NULL; tz_rules[1].name = NULL;