tst-epoll: increase waiting time before sending signal to the child
Checks
Context | Check | Description
redhat-pt-bot/TryBot-apply_patch | success | Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit | success | Build for i686
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 | success | Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 | success | Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-arm | success | Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm | success | Testing passed
Commit Message
When running the testsuite in parallel, for instance running make -j
$(nproc) check, from time to time tst-epoll fails with a timeout. It
happens because it sometimes takes a bit more than 10ms for the process
to get cloned and blocked by the syscall. In that case the signal is
sent too early, and the test fails with a timeout. This happens even on
fast hosts.
This patch increases the waiting time to 100ms to make it more reliable.
It corresponds to 20% of the epoll wait time, so there is still some
margin on that side.
---
sysdeps/unix/sysv/linux/tst-epoll.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Jan 23 2024, Aurelien Jarno wrote:
> When running the testsuite in parallel, for instance running make -j
> $(nproc) check, from time to time tst-epoll fails with a timeout. It
> happens because it sometimes takes a bit more than 10ms for the process
> to get cloned and blocked by the syscall. In that case the signal is
> sent too early, and the test fails with a timeout. This happens even on
> fast hosts.
>
> This patch increases the waiting time to 100ms to make it more reliable.
> It corresponds to 20% of the epoll wait time, so there is still some
> margin on that side.
Can this be synchronized properly? A racy test is worthless.
On 23/01/24 19:23, Andreas Schwab wrote:
> On Jan 23 2024, Aurelien Jarno wrote:
>
>> When running the testsuite in parallel, for instance running make -j
>> $(nproc) check, from time to time tst-epoll fails with a timeout. It
>> happens because it sometimes takes a bit more than 10ms for the process
>> to get cloned and blocked by the syscall. In that case the signal is
>> sent too early, and the test fails with a timeout. This happens even on
>> fast hosts.
>>
>> This patch increases the waiting time to 100ms to make it more reliable.
>> It corresponds to 20% of the epoll wait time, so there is still some
>> margin on that side.
>
> Can this be synchronized properly? A racy test is worthless.
>
Maybe either:
  static pthread_barrier_t *barrier;

  barrier = support_shared_allocate (sizeof (*barrier));
  {
    pthread_barrierattr_t attr;
    xpthread_barrierattr_init (&attr);
    xpthread_barrierattr_setpshared (&attr, PTHREAD_PROCESS_SHARED);
    xpthread_barrier_init (barrier, &attr, 2);
    xpthread_barrierattr_destroy (&attr);
  }

  /* Child. */
  xpthread_barrier_wait (barrier);
  epoll_ctl (...)

  /* Parent. */
  xpthread_barrier_wait (barrier);
  kill (...)
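
For readers without the glibc test-support library at hand, the barrier idea can be sketched with plain POSIX calls. This is an illustration under stated assumptions, not the code proposed for tst-epoll.c: make_shared_barrier and run_barrier_demo are hypothetical names, mmap stands in for support_shared_allocate, and pause stands in for the blocking epoll wait. Note that a barrier only shows that the child has reached the rendezvous point, not that it is already blocked in the syscall.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a process-shared barrier for COUNT
   participants in anonymous shared memory.  Returns NULL on failure.  */
static pthread_barrier_t *
make_shared_barrier (unsigned int count)
{
  pthread_barrier_t *b = mmap (NULL, sizeof (*b), PROT_READ | PROT_WRITE,
                               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (b == MAP_FAILED)
    return NULL;

  pthread_barrierattr_t attr;
  if (pthread_barrierattr_init (&attr) != 0)
    {
      munmap (b, sizeof (*b));
      return NULL;
    }
  int ok = (pthread_barrierattr_setpshared (&attr,
                                            PTHREAD_PROCESS_SHARED) == 0
            && pthread_barrier_init (b, &attr, count) == 0);
  pthread_barrierattr_destroy (&attr);
  if (!ok)
    {
      munmap (b, sizeof (*b));
      return NULL;
    }
  return b;
}

/* Hypothetical demo: parent and child meet at the barrier, then the
   parent signals the child.  Returns the signal that terminated the
   child, or -1 on error.  */
static int
run_barrier_demo (void)
{
  pthread_barrier_t *b = make_shared_barrier (2);
  if (b == NULL)
    return -1;

  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* Child: announce readiness, then block (stand-in for the
         epoll wait in the real test).  */
      pthread_barrier_wait (b);
      for (;;)
        pause ();
    }

  /* Parent: returns only once the child has reached its
     pthread_barrier_wait call.  */
  pthread_barrier_wait (b);

  int status = 0;
  if (kill (pid, SIGTERM) != 0 || waitpid (pid, &status, 0) != pid)
    return -1;

  pthread_barrier_destroy (b);
  munmap (b, sizeof (*b));
  return WIFSIGNALED (status) ? WTERMSIG (status) : -1;
}
```

Calling run_barrier_demo () from a test and comparing the result with SIGTERM exercises the whole sequence. The demo uses a signal whose default action terminates the child, so it does not depend on the child being inside pause when the signal arrives; the real test does depend on that, which is why the barrier alone narrows the race without closing it.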
Or:
  /* Parent. */
  support_process_state_wait (pid, support_process_state_sleeping);
  kill (...)
@@ -98,7 +98,7 @@ test_epoll_basic (epoll_wait_check_t epoll_wait_check)
   xclose (fds[1][1]);
   /* Wait some time so child is blocked on the syscall. */
-  nanosleep (&(struct timespec) {0, 10000000}, NULL);
+  nanosleep (&(struct timespec) {0, 100000000}, NULL);
   TEST_COMPARE (kill (p, SIGUSR1), 0);
   int e = epoll_wait_check (efd, &event, 1, 500000000, &ss);
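
The two nanosecond literals in the hunk differ by a single zero, which is easy to misread. The sketch below restates the figures from the commit message (10 ms before, 100 ms after, 500 ms epoll wait) with hypothetical helpers, timespec_from_msec and percent_of, that are not part of the patch or of glibc.

```c
#include <assert.h>
#include <time.h>

enum { NSEC_PER_MSEC = 1000000 };

/* Hypothetical helper: build a struct timespec from milliseconds.  */
static struct timespec
timespec_from_msec (long msec)
{
  return (struct timespec) { .tv_sec = msec / 1000,
                             .tv_nsec = (msec % 1000) * NSEC_PER_MSEC };
}

/* Hypothetical helper: PART as an integer percentage of WHOLE, both
   in nanoseconds.  */
static long long
percent_of (long long part_ns, long long whole_ns)
{
  return part_ns * 100 / whole_ns;
}
```

With these, timespec_from_msec (100) yields the new delay of 100000000 ns, percent_of (100000000, 500000000) gives the 20% quoted in the commit message, and the old 10 ms delay comes out at 2% of the same wait.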