[v5] Fix time/tst-cpuclock1 intermitent failures

Message ID 20200323172008.11370-1-lamm@linux.ibm.com
State Superseded
Headers
Series [v5] Fix time/tst-cpuclock1 intermitent failures |

Commit Message

Lucas A. M. Magalhaes March 23, 2020, 5:20 p.m. UTC
  This test fails intermittently in systems with heavy load as
CLOCK_PROCESS_CPUTIME_ID is subject to scheduler pressure.  Thus the
test boundaries were relaxed to keep it from failing on such systems.

A refactor of the spent time checking was made with some support
functions.  With the advantage to representing time jitter in percent
of the target.

The values used by the test boundaries are all empirical.

---

Hi,

changes on V5:
	- add tests for support_timespec_check_in_range
	- fix support_timespec_normalize
	- add comments
	- fix style

changes on V4:
	- move functions to support/timespec.c
	- simplify functions

changes on V3:
	- refactor support functions
	- use existing timespec-sub function

changes on V2:
	- Add support functions
---
 support/Makefile       |   1 +
 support/timespec.c     |  36 ++++++++++
 support/timespec.h     |   6 ++
 support/tst-timespec.c | 147 +++++++++++++++++++++++++++++++++++++++++
 time/tst-cpuclock1.c   |  48 +++++---------
 5 files changed, 208 insertions(+), 30 deletions(-)
 create mode 100644 support/tst-timespec.c
  

Comments

Carlos O'Donell March 23, 2020, 9:06 p.m. UTC | #1
On 3/23/20 1:20 PM, Lucas A. M. Magalhaes via Libc-alpha wrote:
> This test fails intermittently in systems with heavy load as
> CLOCK_PROCESS_CPUTIME_ID is subject to scheduler pressure.  Thus the
> test boundaries were relaxed to keep it from failing on such systems.
> 
> A refactor of the spent time checking was made with some support
> functions.  With the advantage to representing time jitter in percent
> of the target.
> 
> The values used by the test boundaries are all empirical.

Suggestions below.

Looking forward to v6.

> ---
> 
> Hi,
> 
> changes on V5:
> 	- add tests for support_timespec_check_in_range
> 	- fix support_timespec_normalize
> 	- add comments
> 	- fix style
> 
> changes on V4:
> 	- move functions to support/timespec.c
> 	- simplify functions
> 
> changes on V3:
> 	- refactor support functions
> 	- use existing timespec-sub function
> 
> changes on V2:
> 	- Add support functions
> ---
>  support/Makefile       |   1 +
>  support/timespec.c     |  36 ++++++++++
>  support/timespec.h     |   6 ++
>  support/tst-timespec.c | 147 +++++++++++++++++++++++++++++++++++++++++
>  time/tst-cpuclock1.c   |  48 +++++---------
>  5 files changed, 208 insertions(+), 30 deletions(-)
>  create mode 100644 support/tst-timespec.c
> 
> diff --git a/support/Makefile b/support/Makefile
> index 6e38b87ebe..cacaac96a5 100644
> --- a/support/Makefile
> +++ b/support/Makefile
> @@ -233,6 +233,7 @@ tests = \
>    tst-test_compare \
>    tst-test_compare_blob \
>    tst-test_compare_string \
> +  tst-timespec \

OK.

>    tst-xreadlink \
>    tst-xsigstack \
>  
> diff --git a/support/timespec.c b/support/timespec.c
> index ea6b947546..53e07566b6 100644
> --- a/support/timespec.c
> +++ b/support/timespec.c
> @@ -19,6 +19,7 @@
>  #include <support/timespec.h>
>  #include <stdio.h>
>  #include <stdint.h>
> +#include <assert.h>
>  
>  void
>  test_timespec_before_impl (const char *file, int line,
> @@ -57,3 +58,38 @@ test_timespec_equal_or_after_impl (const char *file, int line,
>  	    (intmax_t) diff.tv_sec, (intmax_t) diff.tv_nsec);
>    }
>  }
> +
> +/* Returns t normalized timespec with .tv_nsec < TIMESPEC_HZ
> +   and the overflows added to .tv_sec.  */
> +struct timespec
> +support_timespec_normalize (struct timespec t)

Please don't use single variable names. Use "from" or "raw".

> +{
> +  struct timespec r;
> +  r.tv_sec = t.tv_sec + (t.tv_nsec / TIMESPEC_HZ);
> +  r.tv_nsec = t.tv_nsec % TIMESPEC_HZ;
> +  return r;
> +}

Likewise "r" could be "cooked" or "norm".

> +
> +/* Returns TRUE if the observed time is within the given percentage bounds of
> +the expected time, and FALSE otherwise.
> +For example the call
> +
> +support_timespec_check_in_range(expected, observed, .5, 1.2);
> +
> +will check if
> +
> +.5 <= observed/expected <= 1.2
> +
> +In other words it will check if observed time is within 50% to 120% of
> +the expected time.  */
> +int
> +support_timespec_check_in_range (struct timespec expected, struct timespec observed,
> +			      double lower_bound, double upper_bound)
> +{
> +  assert (upper_bound >= lower_bound);
> +  long expected_norm = expected.tv_sec * TIMESPEC_HZ + expected.tv_nsec;

This can cause overflow/underflow.

Please review timespec_add.

We should set this to a extreme value just like timepsec_add for both overflow/underflow.

> +  assert(expected_norm != 0);

Why can't expected_norm be zero?

If you have an abstract timespec you may want to check against that.

I would assert that all values are *positive* and write that into the comments
above.

> +  long observed_norm = observed.tv_sec * TIMESPEC_HZ + observed.tv_nsec;
> +  double ratio = (double)observed_norm / expected_norm;
> +  return (lower_bound <= ratio && ratio <= upper_bound);
> +}
> diff --git a/support/timespec.h b/support/timespec.h
> index c5852dfe75..9bd6942957 100644
> --- a/support/timespec.h
> +++ b/support/timespec.h
> @@ -48,6 +48,12 @@ void test_timespec_equal_or_after_impl (const char *file, int line,
>                                          const struct timespec left,
>                                          const struct timespec right);
>  
> +struct timespec support_timespec_normalize (struct timespec t);

Please avoid single character variable names.

> +
> +int support_timespec_check_in_range (struct timespec expected, struct timespec observed,
> +				  double lower_bound, double upper_bound);
> +
> +
>  /* Check that the timespec on the left represents a time before the
>     time on the right. */
>  #define TEST_TIMESPEC_BEFORE(left, right)                               \
> diff --git a/support/tst-timespec.c b/support/tst-timespec.c
> new file mode 100644
> index 0000000000..06d817a421
> --- /dev/null
> +++ b/support/tst-timespec.c
> @@ -0,0 +1,147 @@
> +/* Test for support_timespec_check_in_range function.

OK.

> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <support/timespec.h>
> +#include <support/check.h>
> +#include <limits.h>
> +
> +struct timespec_test_case
> +{
> +  struct timespec expected;
> +  struct timespec observed;
> +  double upper_bound;
> +  double lower_bound;
> +  int result;
> +};
> +
> +struct timespec_test_case test_cases[] = {
> +  // 0 - In range
> +  {.expected = { .tv_sec = 1, .tv_nsec = 0},
> +   .observed = {.tv_sec = 1, .tv_nsec = 0},
> +   .upper_bound = 1, .lower_bound = 1, .result = 1,
> +  },
> +  // 1 - Out of range
> +  {.expected = { .tv_sec = 1, .tv_nsec = 0},
> +   .observed = {.tv_sec = 2, .tv_nsec = 0},
> +   .upper_bound = 1, .lower_bound = 1, .result = 0,
> +  },
> +  // 2 - Upper Bound
> +  {.expected = { .tv_sec = 1, .tv_nsec = 0},
> +   .observed = {.tv_sec = 2, .tv_nsec = 0},
> +   .upper_bound = 2, .lower_bound = 1, .result = 1,
> +  },
> +  // 3 - Lower Bound
> +  {.expected = { .tv_sec = 1, .tv_nsec = 0},
> +   .observed = {.tv_sec = 0, .tv_nsec = 0},
> +   .upper_bound = 1, .lower_bound = 0, .result = 1,
> +  },
> +  // 4 - Out of range by nanosecs
> +  {.expected = { .tv_sec = 1, .tv_nsec = 0},
> +   .observed = {.tv_sec = 1, .tv_nsec = 500},
> +   .upper_bound = 1, .lower_bound = 1, .result = 0,
> +  },
> +  // 5 - In range by nanosecs
> +  {.expected = { .tv_sec = 1, .tv_nsec = 0},
> +   .observed = {.tv_sec = 1, .tv_nsec = 50000},
> +   .upper_bound = 1.3, .lower_bound = 1, .result = 1,
> +  },
> +  // 6 - Big nanosecs
> +  {.expected = { .tv_sec = 1, .tv_nsec = 0},
> +   .observed = {.tv_sec = 0, .tv_nsec = 4000000},
> +   .upper_bound = 1, .lower_bound = .001, .result = 1,
> +  },
> +  // 7 - In range Negative values
> +  {.expected = { .tv_sec = -1, .tv_nsec = 0},
> +   .observed = {.tv_sec = -1, .tv_nsec = 0},
> +   .upper_bound = 1, .lower_bound = 1, .result = 1,
> +  },
> +  // 8 - Out of range Negative values
> +  {.expected = { .tv_sec = -1, .tv_nsec = 0},
> +   .observed = {.tv_sec = -1, .tv_nsec = 0},
> +   .upper_bound = -1, .lower_bound = -1, .result = 0,
> +  },
> +  // 9 - Negative values with negative nanosecs
> +  {.expected = { .tv_sec = -1, .tv_nsec = 0},
> +   .observed = {.tv_sec = -1, .tv_nsec = -2000},
> +   .upper_bound = 1, .lower_bound = 1, .result = 0,
> +  },
> +  // 10 - Strict bounds
> +  {.expected = { .tv_sec = -1, .tv_nsec = 0},
> +   .observed = {.tv_sec = -1, .tv_nsec = -20000},
> +   .upper_bound = 1.00002, .lower_bound = 1.0000191, .result = 1,
> +  },
> +  // 11 - Strict bounds with loose upper bound
> +  {.expected = { .tv_sec = 1, .tv_nsec = 20000},
> +   .observed = {.tv_sec = 1, .tv_nsec = 30000},
> +   .upper_bound = 1.0000100000, .lower_bound = 1.0000099998, .result = 1,
> +  },
> +  // 12 - Strict bounds with loose lower bound
> +  {.expected = { .tv_sec = 1, .tv_nsec = 20000},
> +   .observed = {.tv_sec = 1, .tv_nsec = 30000},
> +   .upper_bound = 1.0000099999, .lower_bound = 1.00000999979, .result = 1,
> +  },
> +  // 13 - Strict bounds highest precision
> +  {.expected = { .tv_sec = 1, .tv_nsec = 20000},
> +   .observed = {.tv_sec = 1, .tv_nsec = 30000},
> +   .upper_bound = 1.00000999980001, .lower_bound = 1.00000999979999, .result = 1,
> +  },
> +  /* Maximum/Minimum long values  */
> +  // 14
> +  {.expected = { .tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
> +   .observed = {.tv_sec = LONG_MAX, .tv_nsec = LONG_MAX -1},
> +   .upper_bound = 1, .lower_bound = .9, .result = 1,
> +  },
> +  // 15
> +  {.expected = { .tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
> +   .observed = {.tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
> +   .upper_bound = 1, .lower_bound = 1, .result = 1,
> +  },
> +  // 16
> +  {.expected = { .tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
> +   .observed = {.tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
> +   .upper_bound = 1, .lower_bound = 1, .result = 0,
> +  },
> +  // 17
> +  {.expected = { .tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
> +   .observed = {.tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
> +   .upper_bound = 1, .lower_bound = 1, .result = 1,
> +  },
> +  // 18
> +  {.expected = { .tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
> +   .observed = {.tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
> +   .upper_bound = 1, .lower_bound = 1, .result = 0,
> +  }
> +};

Several of the tests above would cause overflow in the normalization.

I'll review them more closely in v6.

> +
> +static int
> +do_test (void)
> +{
> +  int i = 0;
> +
> +  int ntests = sizeof (test_cases) / sizeof (test_cases);
> +  for (i = 0; i < ntests; i++)
> +    {
> +      TEST_COMPARE (support_timespec_check_in_range
> +		    (test_cases[i].expected, test_cases[i].observed,
> +		     test_cases[i].lower_bound,
> +		     test_cases[i].upper_bound), test_cases[i].result);
> +    }
> +  return 0;
> +}

OK.

> +
> +#include <support/test-driver.c>
> diff --git a/time/tst-cpuclock1.c b/time/tst-cpuclock1.c
> index 0120906f23..fe9bb0a31e 100644
> --- a/time/tst-cpuclock1.c
> +++ b/time/tst-cpuclock1.c
> @@ -26,6 +26,7 @@
>  #include <signal.h>
>  #include <stdint.h>
>  #include <sys/wait.h>
> +#include <support/timespec.h>
>  
>  /* This function is intended to rack up both user and system time.  */
>  static void
> @@ -155,16 +156,11 @@ do_test (void)
>    printf ("live PID %d after sleep => %ju.%.9ju\n",
>  	  child, (uintmax_t) after.tv_sec, (uintmax_t) after.tv_nsec);
>  
> -  struct timespec diff = { .tv_sec = after.tv_sec - before.tv_sec,
> -			   .tv_nsec = after.tv_nsec - before.tv_nsec };
> -  if (diff.tv_nsec < 0)
> -    {
> -      --diff.tv_sec;
> -      diff.tv_nsec += 1000000000;
> -    }
> -  if (diff.tv_sec != 0
> -      || diff.tv_nsec > 600000000
> -      || diff.tv_nsec < 100000000)
> +  /* The bound values are empirically defined by testing this code over high cpu
> +     usage and different nice values.  */
> +  struct timespec diff = timespec_sub (support_timespec_normalize (after),
> +				       support_timespec_normalize (before));
> +  if (!support_timespec_check_in_range (sleeptime, diff, .0025,  1.3))

The value of 0.0025 doesn't seem correct, can you please confirm that?

This is supposed to be a 0.5s wait, and wait of 1ms doesn't seem correct.

>      {
>        printf ("before - after %ju.%.9ju outside reasonable range\n",
>  	      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
> @@ -194,19 +190,14 @@ do_test (void)
>  	}
>        else
>  	{
> -	  struct timespec d = { .tv_sec = afterns.tv_sec - after.tv_sec,
> -				.tv_nsec = afterns.tv_nsec - after.tv_nsec };
> -	  if (d.tv_nsec < 0)
> -	    {
> -	      --d.tv_sec;
> -	      d.tv_nsec += 1000000000;
> -	    }
> -	  if (d.tv_sec > 0
> -	      || d.tv_nsec < sleeptime.tv_nsec
> -	      || d.tv_nsec > sleeptime.tv_nsec * 2)
> +        /* The bound values are empirically defined by testing this code over
> +           high cpu usage and different nice values.  */
> +	  diff = timespec_sub (support_timespec_normalize (afterns),
> +			       support_timespec_normalize (after));
> +	  if (!support_timespec_check_in_range (sleeptime, diff, .71, 1.6))

OK.

>  	    {
>  	      printf ("nanosleep time %ju.%.9ju outside reasonable range\n",
> -		      (uintmax_t) d.tv_sec, (uintmax_t) d.tv_nsec);
> +		      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
>  	      result = 1;
>  	    }
>  	}
> @@ -240,15 +231,12 @@ do_test (void)
>    /* Should be close to 0.6.  */
>    printf ("dead PID %d => %ju.%.9ju\n",
>  	  child, (uintmax_t) dead.tv_sec, (uintmax_t) dead.tv_nsec);
> -
> -  diff.tv_sec = dead.tv_sec - after.tv_sec;
> -  diff.tv_nsec = dead.tv_nsec - after.tv_nsec;
> -  if (diff.tv_nsec < 0)
> -    {
> -      --diff.tv_sec;
> -      diff.tv_nsec += 1000000000;
> -    }
> -  if (diff.tv_sec != 0 || diff.tv_nsec > 200000000)
> +  /* The bound values are empirically defined by testing this code over high cpu
> +     usage and different nice values.  */
> +  diff = timespec_sub (support_timespec_normalize (dead),
> +		       support_timespec_normalize (after));
> +  sleeptime.tv_nsec = 100000000;
> +  if (!support_timespec_check_in_range (sleeptime, diff, .7, 1.6))

OK.

>      {
>        printf ("dead - after %ju.%.9ju outside reasonable range\n",
>  	      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
>
  
Lucas A. M. Magalhaes March 24, 2020, 7:42 p.m. UTC | #2
Hi Carlos,

Thanks for the review.

Quoting Carlos O'Donell (2020-03-23 18:06:34)
> On 3/23/20 1:20 PM, Lucas A. M. Magalhaes via Libc-alpha wrote:
> > +
> > +/* Returns TRUE if the observed time is within the given percentage bounds of
> > +the expected time, and FALSE otherwise.
> > +For example the call
> > +
> > +support_timespec_check_in_range(expected, observed, .5, 1.2);
> > +
> > +will check if
> > +
> > +.5 <= observed/expected <= 1.2
> > +
> > +In other words it will check if observed time is within 50% to 120% of
> > +the expected time.  */
> > +int
> > +support_timespec_check_in_range (struct timespec expected, struct timespec observed,
> > +                           double lower_bound, double upper_bound)
> > +{
> > +  assert (upper_bound >= lower_bound);
> > +  long expected_norm = expected.tv_sec * TIMESPEC_HZ + expected.tv_nsec;
> 
> This can cause overflow/underflow.
> 
> Please review timespec_add.
> 
> We should set this to a extreme value just like timepsec_add for both overflow/underflow.
> 
> > +  assert(expected_norm != 0);
> 
> Why can't expected_norm be zero?
> 

It can't be zero because of the division below.  Do you have any suggestions on
this matter?

> If you have an abstract timespec you may want to check against that.
> 
> I would assert that all values are *positive* and write that into the comments
> above.
> 
> > +  long observed_norm = observed.tv_sec * TIMESPEC_HZ + observed.tv_nsec;
> > +  double ratio = (double)observed_norm / expected_norm;
> > +  return (lower_bound <= ratio && ratio <= upper_bound);
> > +}

[...]

> > +
> > +#include <support/test-driver.c>
> > diff --git a/time/tst-cpuclock1.c b/time/tst-cpuclock1.c
> > index 0120906f23..fe9bb0a31e 100644
> > --- a/time/tst-cpuclock1.c
> > +++ b/time/tst-cpuclock1.c
> > @@ -26,6 +26,7 @@
> >  #include <signal.h>
> >  #include <stdint.h>
> >  #include <sys/wait.h>
> > +#include <support/timespec.h>
> >  
> >  /* This function is intended to rack up both user and system time.  */
> >  static void
> > @@ -155,16 +156,11 @@ do_test (void)
> >    printf ("live PID %d after sleep => %ju.%.9ju\n",
> >         child, (uintmax_t) after.tv_sec, (uintmax_t) after.tv_nsec);
> >  
> > -  struct timespec diff = { .tv_sec = after.tv_sec - before.tv_sec,
> > -                        .tv_nsec = after.tv_nsec - before.tv_nsec };
> > -  if (diff.tv_nsec < 0)
> > -    {
> > -      --diff.tv_sec;
> > -      diff.tv_nsec += 1000000000;
> > -    }
> > -  if (diff.tv_sec != 0
> > -      || diff.tv_nsec > 600000000
> > -      || diff.tv_nsec < 100000000)
> > +  /* The bound values are empirically defined by testing this code over high cpu
> > +     usage and different nice values.  */
> > +  struct timespec diff = timespec_sub (support_timespec_normalize (after),
> > +                                    support_timespec_normalize (before));
> > +  if (!support_timespec_check_in_range (sleeptime, diff, .0025,  1.3))
> 
> The value of 0.0025 doesn't seem correct, can you please confirm that?
> 

I got values as low as 0,0008s for this. But I can be more restrict.  The
values lower than 0.1 are less than < 1% of my sample.  But these are the ones
bothering during cpu stress.

> This is supposed to be a 0.5s wait, and wait of 1ms doesn't seem correct.
> 

---
Lucas A. M. Magalhães
  
Carlos O'Donell March 31, 2020, 6:55 p.m. UTC | #3
On 3/24/20 3:42 PM, Lucas A. M. Magalhaes wrote:
> Hi Carlos,
> 
> Thanks for the review.
> 
> Quoting Carlos O'Donell (2020-03-23 18:06:34)
>> On 3/23/20 1:20 PM, Lucas A. M. Magalhaes via Libc-alpha wrote:
>>> +
>>> +/* Returns TRUE if the observed time is within the given percentage bounds of
>>> +the expected time, and FALSE otherwise.
>>> +For example the call
>>> +
>>> +support_timespec_check_in_range(expected, observed, .5, 1.2);
>>> +
>>> +will check if
>>> +
>>> +.5 <= observed/expected <= 1.2
>>> +
>>> +In other words it will check if observed time is within 50% to 120% of
>>> +the expected time.  */
>>> +int
>>> +support_timespec_check_in_range (struct timespec expected, struct timespec observed,
>>> +                           double lower_bound, double upper_bound)
>>> +{
>>> +  assert (upper_bound >= lower_bound);
>>> +  long expected_norm = expected.tv_sec * TIMESPEC_HZ + expected.tv_nsec;
>>
>> This can cause overflow/underflow.
>>
>> Please review timespec_add.
>>
>> We should set this to a extreme value just like timepsec_add for both overflow/underflow.
>>
>>> +  assert(expected_norm != 0);
>>
>> Why can't expected_norm be zero?
>>
> 
> It can't be zero because of the division below.  Do you have any suggestions on
> this matter?

Please add a comment explaining why.

>> If you have an abstract timespec you may want to check against that.
>>
>> I would assert that all values are *positive* and write that into the comments
>> above.
>>
>>> +  long observed_norm = observed.tv_sec * TIMESPEC_HZ + observed.tv_nsec;
>>> +  double ratio = (double)observed_norm / expected_norm;
>>> +  return (lower_bound <= ratio && ratio <= upper_bound);
>>> +}
> 
> [...]
> 
>>> +
>>> +#include <support/test-driver.c>
>>> diff --git a/time/tst-cpuclock1.c b/time/tst-cpuclock1.c
>>> index 0120906f23..fe9bb0a31e 100644
>>> --- a/time/tst-cpuclock1.c
>>> +++ b/time/tst-cpuclock1.c
>>> @@ -26,6 +26,7 @@
>>>  #include <signal.h>
>>>  #include <stdint.h>
>>>  #include <sys/wait.h>
>>> +#include <support/timespec.h>
>>>  
>>>  /* This function is intended to rack up both user and system time.  */
>>>  static void
>>> @@ -155,16 +156,11 @@ do_test (void)
>>>    printf ("live PID %d after sleep => %ju.%.9ju\n",
>>>         child, (uintmax_t) after.tv_sec, (uintmax_t) after.tv_nsec);
>>>  
>>> -  struct timespec diff = { .tv_sec = after.tv_sec - before.tv_sec,
>>> -                        .tv_nsec = after.tv_nsec - before.tv_nsec };
>>> -  if (diff.tv_nsec < 0)
>>> -    {
>>> -      --diff.tv_sec;
>>> -      diff.tv_nsec += 1000000000;
>>> -    }
>>> -  if (diff.tv_sec != 0
>>> -      || diff.tv_nsec > 600000000
>>> -      || diff.tv_nsec < 100000000)
>>> +  /* The bound values are empirically defined by testing this code over high cpu
>>> +     usage and different nice values.  */
>>> +  struct timespec diff = timespec_sub (support_timespec_normalize (after),
>>> +                                    support_timespec_normalize (before));
>>> +  if (!support_timespec_check_in_range (sleeptime, diff, .0025,  1.3))
>>
>> The value of 0.0025 doesn't seem correct, can you please confirm that?
>>
> 
> I got values as low as 0,0008s for this. But I can be more restrict.  The
> values lower than 0.1 are less than < 1% of my sample.  But these are the ones
> bothering during cpu stress.

These *have* to be kernel bugs, and we should not paper over kernel bugs. We should
still fail if the kernel has bugs otherwise we'll never get them fixed upstream.

We should make tests robust with respect to waiting for infinite events to happen.
For example when sleeping we are guaranteed to sleep at-least a certain
amount, and our check should be flexible that we slept at least that amount and
then some amount more, to take into account system load.

In summary: We are making the test robust against system load, not against kernel
defects (which should show up).
  

Patch

diff --git a/support/Makefile b/support/Makefile
index 6e38b87ebe..cacaac96a5 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -233,6 +233,7 @@  tests = \
   tst-test_compare \
   tst-test_compare_blob \
   tst-test_compare_string \
+  tst-timespec \
   tst-xreadlink \
   tst-xsigstack \
 
diff --git a/support/timespec.c b/support/timespec.c
index ea6b947546..53e07566b6 100644
--- a/support/timespec.c
+++ b/support/timespec.c
@@ -19,6 +19,7 @@ 
 #include <support/timespec.h>
 #include <stdio.h>
 #include <stdint.h>
+#include <assert.h>
 
 void
 test_timespec_before_impl (const char *file, int line,
@@ -57,3 +58,38 @@  test_timespec_equal_or_after_impl (const char *file, int line,
 	    (intmax_t) diff.tv_sec, (intmax_t) diff.tv_nsec);
   }
 }
+
+/* Returns t normalized timespec with .tv_nsec < TIMESPEC_HZ
+   and the overflows added to .tv_sec.  */
+struct timespec
+support_timespec_normalize (struct timespec t)
+{
+  struct timespec r;
+  r.tv_sec = t.tv_sec + (t.tv_nsec / TIMESPEC_HZ);
+  r.tv_nsec = t.tv_nsec % TIMESPEC_HZ;
+  return r;
+}
+
+/* Returns TRUE if the observed time is within the given percentage bounds of
+the expected time, and FALSE otherwise.
+For example the call
+
+support_timespec_check_in_range(expected, observed, .5, 1.2);
+
+will check if
+
+.5 <= observed/expected <= 1.2
+
+In other words it will check if observed time is within 50% to 120% of
+the expected time.  */
+int
+support_timespec_check_in_range (struct timespec expected, struct timespec observed,
+			      double lower_bound, double upper_bound)
+{
+  assert (upper_bound >= lower_bound);
+  long expected_norm = expected.tv_sec * TIMESPEC_HZ + expected.tv_nsec;
+  assert(expected_norm != 0);
+  long observed_norm = observed.tv_sec * TIMESPEC_HZ + observed.tv_nsec;
+  double ratio = (double)observed_norm / expected_norm;
+  return (lower_bound <= ratio && ratio <= upper_bound);
+}
diff --git a/support/timespec.h b/support/timespec.h
index c5852dfe75..9bd6942957 100644
--- a/support/timespec.h
+++ b/support/timespec.h
@@ -48,6 +48,12 @@  void test_timespec_equal_or_after_impl (const char *file, int line,
                                         const struct timespec left,
                                         const struct timespec right);
 
+struct timespec support_timespec_normalize (struct timespec t);
+
+int support_timespec_check_in_range (struct timespec expected, struct timespec observed,
+				  double lower_bound, double upper_bound);
+
+
 /* Check that the timespec on the left represents a time before the
    time on the right. */
 #define TEST_TIMESPEC_BEFORE(left, right)                               \
diff --git a/support/tst-timespec.c b/support/tst-timespec.c
new file mode 100644
index 0000000000..06d817a421
--- /dev/null
+++ b/support/tst-timespec.c
@@ -0,0 +1,147 @@ 
+/* Test for support_timespec_check_in_range function.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <support/timespec.h>
+#include <support/check.h>
+#include <limits.h>
+
+struct timespec_test_case
+{
+  struct timespec expected;
+  struct timespec observed;
+  double upper_bound;
+  double lower_bound;
+  int result;
+};
+
+struct timespec_test_case test_cases[] = {
+  // 0 - In range
+  {.expected = { .tv_sec = 1, .tv_nsec = 0},
+   .observed = {.tv_sec = 1, .tv_nsec = 0},
+   .upper_bound = 1, .lower_bound = 1, .result = 1,
+  },
+  // 1 - Out of range
+  {.expected = { .tv_sec = 1, .tv_nsec = 0},
+   .observed = {.tv_sec = 2, .tv_nsec = 0},
+   .upper_bound = 1, .lower_bound = 1, .result = 0,
+  },
+  // 2 - Upper Bound
+  {.expected = { .tv_sec = 1, .tv_nsec = 0},
+   .observed = {.tv_sec = 2, .tv_nsec = 0},
+   .upper_bound = 2, .lower_bound = 1, .result = 1,
+  },
+  // 3 - Lower Bound
+  {.expected = { .tv_sec = 1, .tv_nsec = 0},
+   .observed = {.tv_sec = 0, .tv_nsec = 0},
+   .upper_bound = 1, .lower_bound = 0, .result = 1,
+  },
+  // 4 - Out of range by nanosecs
+  {.expected = { .tv_sec = 1, .tv_nsec = 0},
+   .observed = {.tv_sec = 1, .tv_nsec = 500},
+   .upper_bound = 1, .lower_bound = 1, .result = 0,
+  },
+  // 5 - In range by nanosecs
+  {.expected = { .tv_sec = 1, .tv_nsec = 0},
+   .observed = {.tv_sec = 1, .tv_nsec = 50000},
+   .upper_bound = 1.3, .lower_bound = 1, .result = 1,
+  },
+  // 6 - Big nanosecs
+  {.expected = { .tv_sec = 1, .tv_nsec = 0},
+   .observed = {.tv_sec = 0, .tv_nsec = 4000000},
+   .upper_bound = 1, .lower_bound = .001, .result = 1,
+  },
+  // 7 - In range Negative values
+  {.expected = { .tv_sec = -1, .tv_nsec = 0},
+   .observed = {.tv_sec = -1, .tv_nsec = 0},
+   .upper_bound = 1, .lower_bound = 1, .result = 1,
+  },
+  // 8 - Out of range Negative values
+  {.expected = { .tv_sec = -1, .tv_nsec = 0},
+   .observed = {.tv_sec = -1, .tv_nsec = 0},
+   .upper_bound = -1, .lower_bound = -1, .result = 0,
+  },
+  // 9 - Negative values with negative nanosecs
+  {.expected = { .tv_sec = -1, .tv_nsec = 0},
+   .observed = {.tv_sec = -1, .tv_nsec = -2000},
+   .upper_bound = 1, .lower_bound = 1, .result = 0,
+  },
+  // 10 - Strict bounds
+  {.expected = { .tv_sec = -1, .tv_nsec = 0},
+   .observed = {.tv_sec = -1, .tv_nsec = -20000},
+   .upper_bound = 1.00002, .lower_bound = 1.0000191, .result = 1,
+  },
+  // 11 - Strict bounds with loose upper bound
+  {.expected = { .tv_sec = 1, .tv_nsec = 20000},
+   .observed = {.tv_sec = 1, .tv_nsec = 30000},
+   .upper_bound = 1.0000100000, .lower_bound = 1.0000099998, .result = 1,
+  },
+  // 12 - Strict bounds with loose lower bound
+  {.expected = { .tv_sec = 1, .tv_nsec = 20000},
+   .observed = {.tv_sec = 1, .tv_nsec = 30000},
+   .upper_bound = 1.0000099999, .lower_bound = 1.00000999979, .result = 1,
+  },
+  // 13 - Strict bounds highest precision
+  {.expected = { .tv_sec = 1, .tv_nsec = 20000},
+   .observed = {.tv_sec = 1, .tv_nsec = 30000},
+   .upper_bound = 1.00000999980001, .lower_bound = 1.00000999979999, .result = 1,
+  },
+  /* Maximum/Minimum long values  */
+  // 14
+  {.expected = { .tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
+   .observed = {.tv_sec = LONG_MAX, .tv_nsec = LONG_MAX -1},
+   .upper_bound = 1, .lower_bound = .9, .result = 1,
+  },
+  // 15
+  {.expected = { .tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
+   .observed = {.tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
+   .upper_bound = 1, .lower_bound = 1, .result = 1,
+  },
+  // 16
+  {.expected = { .tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
+   .observed = {.tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
+   .upper_bound = 1, .lower_bound = 1, .result = 0,
+  },
+  // 17
+  {.expected = { .tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
+   .observed = {.tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
+   .upper_bound = 1, .lower_bound = 1, .result = 1,
+  },
+  // 18
+  {.expected = { .tv_sec = LONG_MIN, .tv_nsec = LONG_MIN},
+   .observed = {.tv_sec = LONG_MAX, .tv_nsec = LONG_MAX},
+   .upper_bound = 1, .lower_bound = 1, .result = 0,
+  }
+};
+
+static int
+do_test (void)
+{
+  int i = 0;
+
+  int ntests = sizeof (test_cases) / sizeof (test_cases);
+  for (i = 0; i < ntests; i++)
+    {
+      TEST_COMPARE (support_timespec_check_in_range
+		    (test_cases[i].expected, test_cases[i].observed,
+		     test_cases[i].lower_bound,
+		     test_cases[i].upper_bound), test_cases[i].result);
+    }
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/time/tst-cpuclock1.c b/time/tst-cpuclock1.c
index 0120906f23..fe9bb0a31e 100644
--- a/time/tst-cpuclock1.c
+++ b/time/tst-cpuclock1.c
@@ -26,6 +26,7 @@ 
 #include <signal.h>
 #include <stdint.h>
 #include <sys/wait.h>
+#include <support/timespec.h>
 
 /* This function is intended to rack up both user and system time.  */
 static void
@@ -155,16 +156,11 @@  do_test (void)
   printf ("live PID %d after sleep => %ju.%.9ju\n",
 	  child, (uintmax_t) after.tv_sec, (uintmax_t) after.tv_nsec);
 
-  struct timespec diff = { .tv_sec = after.tv_sec - before.tv_sec,
-			   .tv_nsec = after.tv_nsec - before.tv_nsec };
-  if (diff.tv_nsec < 0)
-    {
-      --diff.tv_sec;
-      diff.tv_nsec += 1000000000;
-    }
-  if (diff.tv_sec != 0
-      || diff.tv_nsec > 600000000
-      || diff.tv_nsec < 100000000)
+  /* The bound values are empirically defined by testing this code over high cpu
+     usage and different nice values.  */
+  struct timespec diff = timespec_sub (support_timespec_normalize (after),
+				       support_timespec_normalize (before));
+  if (!support_timespec_check_in_range (sleeptime, diff, .0025,  1.3))
     {
       printf ("before - after %ju.%.9ju outside reasonable range\n",
 	      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
@@ -194,19 +190,14 @@  do_test (void)
 	}
       else
 	{
-	  struct timespec d = { .tv_sec = afterns.tv_sec - after.tv_sec,
-				.tv_nsec = afterns.tv_nsec - after.tv_nsec };
-	  if (d.tv_nsec < 0)
-	    {
-	      --d.tv_sec;
-	      d.tv_nsec += 1000000000;
-	    }
-	  if (d.tv_sec > 0
-	      || d.tv_nsec < sleeptime.tv_nsec
-	      || d.tv_nsec > sleeptime.tv_nsec * 2)
+        /* The bound values are empirically defined by testing this code over
+           high cpu usage and different nice values.  */
+	  diff = timespec_sub (support_timespec_normalize (afterns),
+			       support_timespec_normalize (after));
+	  if (!support_timespec_check_in_range (sleeptime, diff, .71, 1.6))
 	    {
 	      printf ("nanosleep time %ju.%.9ju outside reasonable range\n",
-		      (uintmax_t) d.tv_sec, (uintmax_t) d.tv_nsec);
+		      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
 	      result = 1;
 	    }
 	}
@@ -240,15 +231,12 @@  do_test (void)
   /* Should be close to 0.6.  */
   printf ("dead PID %d => %ju.%.9ju\n",
 	  child, (uintmax_t) dead.tv_sec, (uintmax_t) dead.tv_nsec);
-
-  diff.tv_sec = dead.tv_sec - after.tv_sec;
-  diff.tv_nsec = dead.tv_nsec - after.tv_nsec;
-  if (diff.tv_nsec < 0)
-    {
-      --diff.tv_sec;
-      diff.tv_nsec += 1000000000;
-    }
-  if (diff.tv_sec != 0 || diff.tv_nsec > 200000000)
+  /* The bound values are empirically defined by testing this code over high cpu
+     usage and different nice values.  */
+  diff = timespec_sub (support_timespec_normalize (dead),
+		       support_timespec_normalize (after));
+  sleeptime.tv_nsec = 100000000;
+  if (!support_timespec_check_in_range (sleeptime, diff, .7, 1.6))
     {
       printf ("dead - after %ju.%.9ju outside reasonable range\n",
 	      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);