Loosen the limits of time/tst-cpuclock1.

Message ID 20200828085852.1861153-1-stli@linux.ibm.com
State Superseded
Headers
Series Loosen the limits of time/tst-cpuclock1. |

Commit Message

Stefan Liebler Aug. 28, 2020, 8:58 a.m. UTC
  Starting with the commit 04deeaa9ea74b0679dfc9d9155a37b6425f19a9f
"Fix time/tst-cpuclock1 intermitent failures" (2020-07-11),
this test fails quite often on s390x/s390 with one/multiple of those:
"before - after" / "nanosleep time" / "dead - after" ourside reasonable range.

On a zVM/kvm guest the CPUs are shared between multiple guests.
And even on the lpar (kvm host) the CPUs are usually shared between multiple lpars.
The defined CPUs for a lpar/zVM-system could also have lower weights compared
to other lpars which let the steal time further grow.

Usually I build (-j$(nproc)) and test (PARALLELMFLAGS="-j$(nproc)") glibc multiple
times, e.g. with different GCCs, on various lpars or zVM guests at the same time.
During this time, I've run the test for 13500 times and obvserved the following fails:
~600x "before - after"
~60x "nanosleep time"
~70x "dead - after"

I've also observed a lot of "before - after" fails on a intel kvm-guest while
building/testing glibc on it.

The mentioned commit has tighten the limits of valid tv_nsec ranges:
"before - after" (expected: 500000000):
- 100000000 ... 600000000
+ 450000000 ... 550000000

"nanosleep time" (expected: 100000000):
- 100000000 ... 200000000
+ 090000000 ... 120000000

"dead - after" (expected: 100000000):
-           ... 200000000
+ 090000000 ... 120000000

The test itself forks a child process which chew_cpu (user- and kernel-space).
The parent process sleeps with nanosleep(0.5s) and measures the child_clock time:
  

Comments

Florian Weimer Aug. 28, 2020, 12:29 p.m. UTC | #1
* Stefan Liebler via Libc-alpha:

> Starting with the commit 04deeaa9ea74b0679dfc9d9155a37b6425f19a9f "Fix
> time/tst-cpuclock1 intermitent failures" (2020-07-11), this test fails
> quite often on s390x/s390 with one/multiple of those: "before - after"
> / "nanosleep time" / "dead - after" ourside reasonable range.

How much value do these cpuclock tests actually have?  Maybe we should
just remove them, given that their test objective is so difficult to
model in current execution environments.

Maybe we could measure clock discontinuities on the CPU-burning thread
and detect CPU stealing this way.  But I'm not sure if this would be
worth it.

Thanks,
Florian
  
Lucas A. M. Magalhaes Aug. 31, 2020, 12:57 p.m. UTC | #2
Quoting Florian Weimer via Libc-alpha (2020-08-28 09:29:45)
> * Stefan Liebler via Libc-alpha:
> 
> > Starting with the commit 04deeaa9ea74b0679dfc9d9155a37b6425f19a9f "Fix
> > time/tst-cpuclock1 intermitent failures" (2020-07-11), this test fails
> > quite often on s390x/s390 with one/multiple of those: "before - after"
> > / "nanosleep time" / "dead - after" ourside reasonable range.
> 
> How much value do these cpuclock tests actually have?  Maybe we should
> just remove them, given that their test objective is so difficult to
> model in current execution environments.

AFAICS this test was moved from rt/ to time/ when the clock_* functions
were migrated to libc.so. IMHO this test lost it's purpose during this
migration.  I can see why it was needed on rt/, with strict time
measures, but on time/ it's just testing clock_* API. I this case I don't
see a downside to flex the time requirements.

A good thing to add for this test is a comment on the beginning stating
the purpose of it. :)

> 
> Maybe we could measure clock discontinuities on the CPU-burning thread
> and detect CPU stealing this way.  But I'm not sure if this would be
> worth it.

What we want to achieve with this?

---
Lucas A. M. Magalhães
  
Florian Weimer Aug. 31, 2020, 12:59 p.m. UTC | #3
* Lucas A. M. Magalhaes:

> Quoting Florian Weimer via Libc-alpha (2020-08-28 09:29:45)
>> * Stefan Liebler via Libc-alpha:
>> 
>> > Starting with the commit 04deeaa9ea74b0679dfc9d9155a37b6425f19a9f "Fix
>> > time/tst-cpuclock1 intermitent failures" (2020-07-11), this test fails
>> > quite often on s390x/s390 with one/multiple of those: "before - after"
>> > / "nanosleep time" / "dead - after" ourside reasonable range.
>> 
>> How much value do these cpuclock tests actually have?  Maybe we should
>> just remove them, given that their test objective is so difficult to
>> model in current execution environments.
>
> AFAICS this test was moved from rt/ to time/ when the clock_* functions
> were migrated to libc.so. IMHO this test lost it's purpose during this
> migration.  I can see why it was needed on rt/, with strict time
> measures, but on time/ it's just testing clock_* API. I this case I don't
> see a downside to flex the time requirements.

The location in the tree does not impact the test objectives.  The move
was done to reflect that for quite a few releases now, clock_gettime
has been provided by libc, not librt.

>> Maybe we could measure clock discontinuities on the CPU-burning thread
>> and detect CPU stealing this way.  But I'm not sure if this would be
>> worth it.
>
> What we want to achieve with this?

I think the goal is to ensure that glibc and the kernel agree about the
definition of the clocks, i.e. that the clock is indeed a CPU time clock
and does not measure wall-clock time.

Thanks,
Florian
  
Stefan Liebler Sept. 2, 2020, 4:10 p.m. UTC | #4
On 8/31/20 2:59 PM, Florian Weimer wrote:
> * Lucas A. M. Magalhaes:
> 
>> Quoting Florian Weimer via Libc-alpha (2020-08-28 09:29:45)
>>> * Stefan Liebler via Libc-alpha:
>>>
>>>> Starting with the commit 04deeaa9ea74b0679dfc9d9155a37b6425f19a9f "Fix
>>>> time/tst-cpuclock1 intermitent failures" (2020-07-11), this test fails
>>>> quite often on s390x/s390 with one/multiple of those: "before - after"
>>>> / "nanosleep time" / "dead - after" ourside reasonable range.
>>>
>>> How much value do these cpuclock tests actually have?  Maybe we should
>>> just remove them, given that their test objective is so difficult to
>>> model in current execution environments.
>>
>> AFAICS this test was moved from rt/ to time/ when the clock_* functions
>> were migrated to libc.so. IMHO this test lost it's purpose during this
>> migration.  I can see why it was needed on rt/, with strict time
>> measures, but on time/ it's just testing clock_* API. I this case I don't
>> see a downside to flex the time requirements.
> 
> The location in the tree does not impact the test objectives.  The move
> was done to reflect that for quite a few releases now, clock_gettime
> has been provided by libc, not librt.
> 
>>> Maybe we could measure clock discontinuities on the CPU-burning thread
>>> and detect CPU stealing this way.  But I'm not sure if this would be
>>> worth it.
>>
>> What we want to achieve with this?
> 
> I think the goal is to ensure that glibc and the kernel agree about the
> definition of the clocks, i.e. that the clock is indeed a CPU time clock
> and does not measure wall-clock time.
> 
> Thanks,
> Florian
> 

Thanks for your comments.

How do we want to proceed here:
- Shall we just loosen the limits?
- Shall we remove the whole test?
- Shall we remove only the first check which compares nanosleep vs
clock_gettime (child_clock, before|after)?

Bye.
Stefan
  
Florian Weimer Sept. 21, 2020, 11:28 a.m. UTC | #5
* Stefan Liebler:

> How do we want to proceed here:
> - Shall we just loosen the limits?
> - Shall we remove the whole test?
> - Shall we remove only the first check which compares nanosleep vs
> clock_gettime (child_clock, before|after)?

I lean towards removing both time/tst-cpuclock1 and time/tst-cpuclock2.

Thanks,
Florian
  
Lucas A. M. Magalhaes Sept. 29, 2020, 1:53 p.m. UTC | #6
Quoting Florian Weimer (2020-09-21 08:28:31)
> * Stefan Liebler:
> 
> > How do we want to proceed here:
> > - Shall we just loosen the limits?
> > - Shall we remove the whole test?
> > - Shall we remove only the first check which compares nanosleep vs
> > clock_gettime (child_clock, before|after)?
> 
> I lean towards removing both time/tst-cpuclock1 and time/tst-cpuclock2.
> 

I don't oppose against removing them, also.

---
Lucas A. M. Magalhães
  
Adhemerval Zanella Netto Sept. 29, 2020, 2:01 p.m. UTC | #7
On 29/09/2020 10:53, Lucas A. M. Magalhaes via Libc-alpha wrote:
> Quoting Florian Weimer (2020-09-21 08:28:31)
>> * Stefan Liebler:
>>
>>> How do we want to proceed here:
>>> - Shall we just loosen the limits?
>>> - Shall we remove the whole test?
>>> - Shall we remove only the first check which compares nanosleep vs
>>> clock_gettime (child_clock, before|after)?
>>
>> I lean towards removing both time/tst-cpuclock1 and time/tst-cpuclock2.
>>
> 
> I don't oppose against removing them, also.
> 

I also lean to remove these tests. If we need to keep adjusting the time
limits depending of the underlying architecture the tests might loose
their intention to check the interface and/or not indicate a possible
regression.
  
Carlos O'Donell Sept. 29, 2020, 5:22 p.m. UTC | #8
On 9/29/20 10:01 AM, Adhemerval Zanella via Libc-alpha wrote:
> 
> 
> On 29/09/2020 10:53, Lucas A. M. Magalhaes via Libc-alpha wrote:
>> Quoting Florian Weimer (2020-09-21 08:28:31)
>>> * Stefan Liebler:
>>>
>>>> How do we want to proceed here:
>>>> - Shall we just loosen the limits?
>>>> - Shall we remove the whole test?
>>>> - Shall we remove only the first check which compares nanosleep vs
>>>> clock_gettime (child_clock, before|after)?
>>>
>>> I lean towards removing both time/tst-cpuclock1 and time/tst-cpuclock2.
>>>
>>
>> I don't oppose against removing them, also.
>>
> 
> I also lean to remove these tests. If we need to keep adjusting the time
> limits depending of the underlying architecture the tests might loose
> their intention to check the interface and/or not indicate a possible
> regression.

The tests should be removed because they contain *non-timing* related
regression tests for:

* clock_getcpuclockid vs. ENOSYS / ESRCH / EPERM
* clock_getcpuclockid vs. valid child
* clock_gettime of dead child where clock is no longer valid

I don't see any other tests that test for that.

If we want we can just strip out the time-dependent parts of the tests?
  
Adhemerval Zanella Netto Sept. 30, 2020, 11:48 a.m. UTC | #9
On 29/09/2020 14:22, Carlos O'Donell wrote:
> On 9/29/20 10:01 AM, Adhemerval Zanella via Libc-alpha wrote:
>>
>>
>> On 29/09/2020 10:53, Lucas A. M. Magalhaes via Libc-alpha wrote:
>>> Quoting Florian Weimer (2020-09-21 08:28:31)
>>>> * Stefan Liebler:
>>>>
>>>>> How do we want to proceed here:
>>>>> - Shall we just loosen the limits?
>>>>> - Shall we remove the whole test?
>>>>> - Shall we remove only the first check which compares nanosleep vs
>>>>> clock_gettime (child_clock, before|after)?
>>>>
>>>> I lean towards removing both time/tst-cpuclock1 and time/tst-cpuclock2.
>>>>
>>>
>>> I don't oppose against removing them, also.
>>>
>>
>> I also lean to remove these tests. If we need to keep adjusting the time
>> limits depending of the underlying architecture the tests might loose
>> their intention to check the interface and/or not indicate a possible
>> regression.
> 
> The tests should be removed because they contain *non-timing* related
> regression tests for:

I think you meant 'should *not* be remove* based on the points below.

> 
> * clock_getcpuclockid vs. ENOSYS / ESRCH / EPERM
> * clock_getcpuclockid vs. valid child
> * clock_gettime of dead child where clock is no longer valid
> 
> I don't see any other tests that test for that.
> 
> If we want we can just strip out the time-dependent parts of the tests?
> 

This is better idea indeed.
  
Stefan Liebler Oct. 19, 2020, 2:48 p.m. UTC | #10
On 9/30/20 1:48 PM, Adhemerval Zanella via Libc-alpha wrote:
> 
> 
> On 29/09/2020 14:22, Carlos O'Donell wrote:
>> On 9/29/20 10:01 AM, Adhemerval Zanella via Libc-alpha wrote:
>>>
>>>
>>> On 29/09/2020 10:53, Lucas A. M. Magalhaes via Libc-alpha wrote:
>>>> Quoting Florian Weimer (2020-09-21 08:28:31)
>>>>> * Stefan Liebler:
>>>>>
>>>>>> How do we want to proceed here:
>>>>>> - Shall we just loosen the limits?
>>>>>> - Shall we remove the whole test?
>>>>>> - Shall we remove only the first check which compares nanosleep vs
>>>>>> clock_gettime (child_clock, before|after)?
>>>>>
>>>>> I lean towards removing both time/tst-cpuclock1 and time/tst-cpuclock2.
>>>>>
>>>>
>>>> I don't oppose against removing them, also.
>>>>
>>>
>>> I also lean to remove these tests. If we need to keep adjusting the time
>>> limits depending of the underlying architecture the tests might loose
>>> their intention to check the interface and/or not indicate a possible
>>> regression.
>>
>> The tests should be removed because they contain *non-timing* related
>> regression tests for:
> 
> I think you meant 'should *not* be remove* based on the points below.
> 
>>
>> * clock_getcpuclockid vs. ENOSYS / ESRCH / EPERM
>> * clock_getcpuclockid vs. valid child
>> * clock_gettime of dead child where clock is no longer valid
>>
>> I don't see any other tests that test for that.
>>
>> If we want we can just strip out the time-dependent parts of the tests?
>>
> 
> This is better idea indeed.
> 

Hi,

Sorry for the long delay.
According to all the feedback, I've kept the test itself and removed two
of the time-dependent checks.
Please have a look at v2 of the patch:
"[PATCH v2] Loosen the limits of time/tst-cpuclock1."
https://sourceware.org/pipermail/libc-alpha/2020-October/118779.html

Thanks,
Stefan
  

Patch

diff = after - before"
With much workload on the machine, the child won't make much progess
and it can fall much beyond the minimum limit. Thus I've set this limit to 0.45%.
Even with this limit, the test fails for 11..90 times. If set to 0.30% it fails
only 1..8 times.
Does this check makes sense at all?

Afterwards the parent process sleeps  with clock_nanosleep (child_clock, 0.1s):
diff = afterns - after
The test currently also allows 0.9 * 0.1s.  Shouldn't we test the hard limit of
1.0 * 0.1s as minimum limit?
Depending on the workload, the maximum limit can exceed the 1.2 * 0.1s. Therefore
I've set the upper limit to 1.35.

For "dead - after", the parent process kills the child process and waits long
enough to let the child finish dying. Then it gets the time of the child:
diff = dead - after
Note that diff also contains the time for the previous clock_nanosleep.
Thus you'll often see both fails at the same time. I've set the upper limit
to 1.4.
---
 time/tst-cpuclock1.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/time/tst-cpuclock1.c b/time/tst-cpuclock1.c
index 1ac611a92b..52a3485145 100644
--- a/time/tst-cpuclock1.c
+++ b/time/tst-cpuclock1.c
@@ -161,7 +161,7 @@  do_test (void)
      percentile of values and use those values for our testing allowed range.  */
   struct timespec diff = timespec_sub (support_timespec_normalize (after),
 				       support_timespec_normalize (before));
-  if (!support_timespec_check_in_range (sleeptime, diff, .9,  1.1))
+  if (!support_timespec_check_in_range (sleeptime, diff, .45,  1.1))
     {
       printf ("before - after %ju.%.9ju outside reasonable range\n",
 	      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
@@ -197,7 +197,7 @@  do_test (void)
            allowed range.  */
 	  diff = timespec_sub (support_timespec_normalize (afterns),
 			       support_timespec_normalize (after));
-	  if (!support_timespec_check_in_range (sleeptime, diff, .9, 1.2))
+	  if (!support_timespec_check_in_range (sleeptime, diff, 1.0, 1.35))
 	    {
 	      printf ("nanosleep time %ju.%.9ju outside reasonable range\n",
 		      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
@@ -240,7 +240,7 @@  do_test (void)
   diff = timespec_sub (support_timespec_normalize (dead),
 		       support_timespec_normalize (after));
   sleeptime.tv_nsec = 100000000;
-  if (!support_timespec_check_in_range (sleeptime, diff, .9, 1.2))
+  if (!support_timespec_check_in_range (sleeptime, diff, 1.0, 1.4))
     {
       printf ("dead - after %ju.%.9ju outside reasonable range\n",
 	      (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);