Message ID | 5717B2F4.9050105@starleaf.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 112569 invoked by alias); 20 Apr 2016 16:49:15 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 112551 invoked by uid 89); 20 Apr 2016 16:49:14 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=ensured X-HELO: mail-wm0-f48.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:to:from:subject:message-id:date:user-agent :mime-version:content-transfer-encoding; bh=0qqbgExHfK6ADMbElICSfhnFZfX+4R8dsTLuFPI3z/8=; b=QMCNhOCn6FmMIOp5vE1C6pJKBi+bywS3EGBfaJP9SYbKW8DcPkMYSzE6ImSyi91Jza +Bd30NqJZs9SQoJMXCq0TGZFo/GNojKfhYWE9aI1Kkg2RJMjPwxF8TtciNZ+aTYEw5wK YcbOzNDQtSFvnuWRH3Hq8OW26JJ39aHKWITRWcpWHPsBj7ix0CrqVEKraR4Kv5gHuNpX lBZU4PEZw3gpb5IZxVmOYd6txwGtaG7bAgf0sLQ4po2Rr8vH0sTg1ykLl1UEvuR90qxd Ej/Oprc6XW3OzWmt1a3vRy73wuuL8c9M0jKkje6nXap4Zq3qnYcvjXVfQkJYrslFli/5 JeuA== X-Gm-Message-State: AOPr4FXsDS3bMw45Gygz2vaiXpd8ZcHlhRnfJonfaqX2/TXzhLXVczXPwGgoZq2ZPyL1zxib X-Received: by 10.28.150.193 with SMTP id y184mr11633199wmd.41.1461170941911; Wed, 20 Apr 2016 09:49:01 -0700 (PDT) To: GNU C Library <libc-alpha@sourceware.org> From: Mark Thompson <mark.thompson@starleaf.com> Subject: [PATCH] Don't divide by zero when trying to destroy an uninitialised barrier. Message-ID: <5717B2F4.9050105@starleaf.com> Date: Wed, 20 Apr 2016 17:48:52 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.6.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit |
Commit Message
Mark Thompson
April 20, 2016, 4:48 p.m. UTC
--- nptl/pthread_barrier_destroy.c | 9 +++++++++ 1 file changed, 9 insertions(+)
Comments
On 20/04/16 17:48, Mark Thompson wrote: > --- > nptl/pthread_barrier_destroy.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/nptl/pthread_barrier_destroy.c b/nptl/pthread_barrier_destroy.c > index 92d2027..d114084 100644 > --- a/nptl/pthread_barrier_destroy.c > +++ b/nptl/pthread_barrier_destroy.c > @@ -36,6 +36,15 @@ pthread_barrier_destroy (pthread_barrier_t *barrier) > they have exited as well. To get the notification, pretend that we have > reached the reset threshold. */ > unsigned int count = bar->count; > + > + /* For an initialised barrier, count must be greater than zero here. An > + uninitialised barrier may still have zero, however, and in this case it is > + preferable to fail immediately rather than to invoke undefined behaviour > + by dividing by zero on the next line. (POSIX allows the implementation to > + diagnose invalid state and return EINVAL in this case.) */ > + if (__glibc_unlikely (count == 0)) > + return EINVAL; > + this case is undefined behaviour in posix, and i think the best way to handle that is crashing. (because no behaviour can be portably relied upon) nowadays posix says "The [EINVAL] error for an uninitialized barrier attributes object is removed; this condition results in undefined behavior." > unsigned int max_in_before_reset = BARRIER_IN_THRESHOLD > - BARRIER_IN_THRESHOLD % count; > /* Relaxed MO sufficient because the program must have ensured that all >
On 20/04/16 18:03, Szabolcs Nagy wrote: > On 20/04/16 17:48, Mark Thompson wrote: >> --- >> nptl/pthread_barrier_destroy.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/nptl/pthread_barrier_destroy.c b/nptl/pthread_barrier_destroy.c >> index 92d2027..d114084 100644 >> --- a/nptl/pthread_barrier_destroy.c >> +++ b/nptl/pthread_barrier_destroy.c >> @@ -36,6 +36,15 @@ pthread_barrier_destroy (pthread_barrier_t *barrier) >> they have exited as well. To get the notification, pretend that we have >> reached the reset threshold. */ >> unsigned int count = bar->count; >> + >> + /* For an initialised barrier, count must be greater than zero here. An >> + uninitialised barrier may still have zero, however, and in this case it is >> + preferable to fail immediately rather than to invoke undefined behaviour >> + by dividing by zero on the next line. (POSIX allows the implementation to >> + diagnose invalid state and return EINVAL in this case.) */ >> + if (__glibc_unlikely (count == 0)) >> + return EINVAL; >> + > > this case is undefined behaviour in posix, and > i think the best way to handle that is crashing. > (because no behaviour can be portably relied upon) The undefined behaviour is not necessarily crashing - systems which do not trap on divide by zero (such as aarch64) will do something else, such as returning success or hanging forever. Would you prefer an abort() be added to make the behavior consistent? > nowadays posix says > "The [EINVAL] error for an uninitialized barrier > attributes object is removed; this condition > results in undefined behavior." It also says: "If an implementation detects that the value specified by the barrier argument to pthread_barrier_destroy() does not refer to an initialized barrier object, it is recommended that the function should fail and report an [EINVAL] error." (<http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_barrier_destroy.html>)
On 20-04-2016 16:16, Mark Thompson wrote: > On 20/04/16 18:03, Szabolcs Nagy wrote: >> On 20/04/16 17:48, Mark Thompson wrote: >>> --- >>> nptl/pthread_barrier_destroy.c | 9 +++++++++ >>> 1 file changed, 9 insertions(+) >>> >>> diff --git a/nptl/pthread_barrier_destroy.c b/nptl/pthread_barrier_destroy.c >>> index 92d2027..d114084 100644 >>> --- a/nptl/pthread_barrier_destroy.c >>> +++ b/nptl/pthread_barrier_destroy.c >>> @@ -36,6 +36,15 @@ pthread_barrier_destroy (pthread_barrier_t *barrier) >>> they have exited as well. To get the notification, pretend that we have >>> reached the reset threshold. */ >>> unsigned int count = bar->count; >>> + >>> + /* For an initialised barrier, count must be greater than zero here. An >>> + uninitialised barrier may still have zero, however, and in this case it is >>> + preferable to fail immediately rather than to invoke undefined behaviour >>> + by dividing by zero on the next line. (POSIX allows the implementation to >>> + diagnose invalid state and return EINVAL in this case.) */ >>> + if (__glibc_unlikely (count == 0)) >>> + return EINVAL; >>> + >> >> this case is undefined behaviour in posix, and >> i think the best way to handle that is crashing. >> (because no behaviour can be portably relied upon) > > The undefined behaviour is not necessarily crashing - systems which do not trap on divide by zero (such as aarch64) will do something else, such as returning success or hanging forever. Would you prefer an abort() be added to make the behavior consistent? > >> nowadays posix says >> "The [EINVAL] error for an uninitialized barrier >> attributes object is removed; this condition >> results in undefined behavior." > > It also says: > > "If an implementation detects that the value specified by the barrier argument to pthread_barrier_destroy() does not refer to an initialized barrier object, it is recommended that the function should fail and report an [EINVAL] error." > > (<http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_barrier_destroy.html>) > I do not see a compelling reason to not return EINVAL if the UB could be detected and if POSIX stated this behaviour is recommended.
On Wed, 2016-04-20 at 20:16 +0100, Mark Thompson wrote: > On 20/04/16 18:03, Szabolcs Nagy wrote: > > On 20/04/16 17:48, Mark Thompson wrote: > >> --- > >> nptl/pthread_barrier_destroy.c | 9 +++++++++ > >> 1 file changed, 9 insertions(+) > >> > >> diff --git a/nptl/pthread_barrier_destroy.c b/nptl/pthread_barrier_destroy.c > >> index 92d2027..d114084 100644 > >> --- a/nptl/pthread_barrier_destroy.c > >> +++ b/nptl/pthread_barrier_destroy.c > >> @@ -36,6 +36,15 @@ pthread_barrier_destroy (pthread_barrier_t *barrier) > >> they have exited as well. To get the notification, pretend that we have > >> reached the reset threshold. */ > >> unsigned int count = bar->count; > >> + > >> + /* For an initialised barrier, count must be greater than zero here. An > >> + uninitialised barrier may still have zero, however, and in this case it is > >> + preferable to fail immediately rather than to invoke undefined behaviour > >> + by dividing by zero on the next line. (POSIX allows the implementation to > >> + diagnose invalid state and return EINVAL in this case.) */ > >> + if (__glibc_unlikely (count == 0)) > >> + return EINVAL; > >> + > > > > this case is undefined behaviour in posix, and > > i think the best way to handle that is crashing. > > (because no behaviour can be portably relied upon) > > The undefined behaviour is not necessarily crashing - systems which do not trap on divide by zero (such as aarch64) will do something else, such as returning success or hanging forever. Would you prefer an abort() be added to make the behavior consistent? IMO, abort() would be better than returning EINVAL. See https://sourceware.org/glibc/wiki/Style_and_Conventions#Bugs_in_the_user_program > > nowadays posix says > > "The [EINVAL] error for an uninitialized barrier > > attributes object is removed; this condition > > results in undefined behavior." > > It also says: > > "If an implementation detects that the value specified by the barrier argument to pthread_barrier_destroy() does not refer to an initialized barrier object, it is recommended that the function should fail and report an [EINVAL] error." > > (<http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_barrier_destroy.html>) But there are no error conditions listed for pthread_barrier_destroy, which I read as no errors being allowed in a correct program. Therefore, this is really UB, and we should call abort().
On 04/20/2016 09:46 PM, Adhemerval Zanella wrote: > I do not see a compelling reason to not return EINVAL if the UB > could be detected and if POSIX stated this behaviour is recommended. It would result in silent loss of synchronization if the return value is not checked. Such bugs are difficult to track down. Florian
On 26/04/2016 11:38, Florian Weimer wrote: > On 04/20/2016 09:46 PM, Adhemerval Zanella wrote: >> I do not see a compelling reason to not return EINVAL if the UB >> could be detected and if POSIX stated this behaviour is recommended. > > It would result in silent loss of synchronization if the return value is not checked. Such bugs are difficult to track down. > > Florian But the check is user responsibility and getting such error means the program is doing something fuzzy. But thinking twice seems that abort in such cases seems a better alternative, it gives the user a more straightforward indication he should check his code.
On 26 Apr 2016 11:44, Adhemerval Zanella wrote: > On 26/04/2016 11:38, Florian Weimer wrote: > > On 04/20/2016 09:46 PM, Adhemerval Zanella wrote: > >> I do not see a compelling reason to not return EINVAL if the UB > >> could be detected and if POSIX stated this behaviour is recommended. > > > > It would result in silent loss of synchronization if the return value is not checked. Such bugs are difficult to track down. > > > > Florian > > But the check is user responsibility and getting such error means the > program is doing something fuzzy. > > But thinking twice seems that abort in such cases seems a better > alternative, it gives the user a more straightforward indication > he should check his code. in principal, i tend to agree with you. but as an active counter-point, i think we can agree that the heap corruption checks which trigger aborts have improved software in the wider ecosystem. ... malloc(): memory corruption (fast) ... whether we'll see as much use in this API, it's hard to say. but if we already have the code in place to detect a bad/invalid scenario, then an abort doesn't seem bad to me. when you start adding more checks though, then due consideration to overhead/fast paths make sense. -mike
On Tue, 2016-04-26 at 11:44 -0300, Adhemerval Zanella wrote: > > On 26/04/2016 11:38, Florian Weimer wrote: > > On 04/20/2016 09:46 PM, Adhemerval Zanella wrote: > >> I do not see a compelling reason to not return EINVAL if the UB > >> could be detected and if POSIX stated this behaviour is recommended. > > > > It would result in silent loss of synchronization if the return value is not checked. Such bugs are difficult to track down. > > > > Florian > > But the check is user responsibility and getting such error means the > program is doing something fuzzy. EINVAL is not listed as an error code, so there is no user responsibility.
On 26/04/2016 15:03, Torvald Riegel wrote: > On Tue, 2016-04-26 at 11:44 -0300, Adhemerval Zanella wrote: >> >> On 26/04/2016 11:38, Florian Weimer wrote: >>> On 04/20/2016 09:46 PM, Adhemerval Zanella wrote: >>>> I do not see a compelling reason to not return EINVAL if the UB >>>> could be detected and if POSIX stated this behaviour is recommended. >>> >>> It would result in silent loss of synchronization if the return value is not checked. Such bugs are difficult to track down. >>> >>> Florian >> >> But the check is user responsibility and getting such error means the >> program is doing something fuzzy. > > EINVAL is not listed as an error code, so there is no user > responsibility. > Alright, so abort seems the best solution then.
On 04/21/2016 12:23 PM, Torvald Riegel wrote: > On Wed, 2016-04-20 at 20:16 +0100, Mark Thompson wrote: >> On 20/04/16 18:03, Szabolcs Nagy wrote: >>> On 20/04/16 17:48, Mark Thompson wrote: >>>> --- >>>> nptl/pthread_barrier_destroy.c | 9 +++++++++ >>>> 1 file changed, 9 insertions(+) >>>> >>>> diff --git a/nptl/pthread_barrier_destroy.c b/nptl/pthread_barrier_destroy.c >>>> index 92d2027..d114084 100644 >>>> --- a/nptl/pthread_barrier_destroy.c >>>> +++ b/nptl/pthread_barrier_destroy.c >>>> @@ -36,6 +36,15 @@ pthread_barrier_destroy (pthread_barrier_t *barrier) >>>> they have exited as well. To get the notification, pretend that we have >>>> reached the reset threshold. */ >>>> unsigned int count = bar->count; >>>> + >>>> + /* For an initialised barrier, count must be greater than zero here. An >>>> + uninitialised barrier may still have zero, however, and in this case it is >>>> + preferable to fail immediately rather than to invoke undefined behaviour >>>> + by dividing by zero on the next line. (POSIX allows the implementation to >>>> + diagnose invalid state and return EINVAL in this case.) */ >>>> + if (__glibc_unlikely (count == 0)) >>>> + return EINVAL; >>>> + >>> >>> this case is undefined behaviour in posix, and >>> i think the best way to handle that is crashing. >>> (because no behaviour can be portably relied upon) >> >> The undefined behaviour is not necessarily crashing - systems which >> do not trap on divide by zero (such as aarch64) will do something >> else, such as returning success or hanging forever. Would you >> prefer an abort() be added to make the behavior consistent? > > IMO, abort() would be better than returning EINVAL. See > https://sourceware.org/glibc/wiki/Style_and_Conventions#Bugs_in_the_user_program Agreed. It's easy to detect. We should abort().
diff --git a/nptl/pthread_barrier_destroy.c b/nptl/pthread_barrier_destroy.c index 92d2027..d114084 100644 --- a/nptl/pthread_barrier_destroy.c +++ b/nptl/pthread_barrier_destroy.c @@ -36,6 +36,15 @@ pthread_barrier_destroy (pthread_barrier_t *barrier) they have exited as well. To get the notification, pretend that we have reached the reset threshold. */ unsigned int count = bar->count; + + /* For an initialised barrier, count must be greater than zero here. An + uninitialised barrier may still have zero, however, and in this case it is + preferable to fail immediately rather than to invoke undefined behaviour + by dividing by zero on the next line. (POSIX allows the implementation to + diagnose invalid state and return EINVAL in this case.) */ + if (__glibc_unlikely (count == 0)) + return EINVAL; + unsigned int max_in_before_reset = BARRIER_IN_THRESHOLD - BARRIER_IN_THRESHOLD % count; /* Relaxed MO sufficient because the program must have ensured that all