Message ID | 20200220143406.4768-1-adhemerval.zanella@linaro.org |
---|---|
State | Rejected |
Delegated to: | Carlos O'Donell |
Headers |
Received: (qmail 105050 invoked by alias); 20 Feb 2020 14:34:16 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 105036 invoked by uid 89); 20 Feb 2020 14:34:15 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-22.1 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mail-qk1-f194.google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id; bh=jZj+M59spw/mIf/dTI2jIVGBwiO0iqCGgXe9nJ9TrTM=; b=WXTJSAHz6w6kkNVS9hUkPw69Vp/mzui34NSMr5nlGs2R2/KFO2CSxoiqcRMMvGW5mu 2fLdKyn1uO74aMBTQRoR/85rbIzUkt5fHcGb02kCql2WKwd85w8byU763YvN/ot07+4q YWKIKY9/Vqaq9BfvhU9dSDd4EEiWIyGXm1In+wRNcV6p/AJneEs0vO3cICQzWTLdhILE eP6Ol5BfAtm9sCvATCsUtO2nnT1Ttfoj+XnNNrHoJq4TVmUBG8WOpWM+Y5uXKcpEGTw0 IIHvoBrh2UTIe+EflP7VVvgNRQpfpypF5ekqrzg/WFtYW0i1q8drYO8dlgE+dmrv62Pw NnGQ== Return-Path: <adhemerval.zanella@linaro.org> From: Adhemerval Zanella <adhemerval.zanella@linaro.org> To: libc-alpha@sourceware.org Subject: [PATCH] support: Kill process group for test failure Date: Thu, 20 Feb 2020 11:34:06 -0300 Message-Id: <20200220143406.4768-1-adhemerval.zanella@linaro.org> |
Commit Message
Adhemerval Zanella Netto
Feb. 20, 2020, 2:34 p.m. UTC
Some testcases that create multiple subprocesses might abort or exit prior waiting for their children. In such case, support_test_main does not try to kill the spawned test process group (as in the test timeout case). On example that we are observing in internal tests is when malloc/tst-mallocfork2 fails to fork in the signal handling (due either maximum number of process or other non expected failure). This patch kill the process group in the case of failed execution, similar on how it is done on timeout. Checked on x86_64-linux-gnu. --- support/support_test_main.c | 7 +++++++ 1 file changed, 7 insertions(+)
Comments
On 2/20/20 9:34 AM, Adhemerval Zanella wrote: > Some testcases that create multiple subprocesses might abort or exit > prior waiting for their children. In such case, support_test_main > does not try to kill the spawned test process group (as in the > test timeout case). > > On example that we are observing in internal tests is when > malloc/tst-mallocfork2 fails to fork in the signal handling (due > either maximum number of process or other non expected failure). > > This patch kill the process group in the case of failed execution, > similar on how it is done on timeout. LGTM. Reviewed-by: Carlos O'Donell <carlos@redhat.com> > Checked on x86_64-linux-gnu. > --- > support/support_test_main.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/support/support_test_main.c b/support/support_test_main.c > index e3f0bf15f2..ac9f710fb7 100644 > --- a/support/support_test_main.c > +++ b/support/support_test_main.c > @@ -459,6 +459,9 @@ support_test_main (int argc, char **argv, const struct test_config *config) > /* Process terminated normaly without timeout etc. */ > if (WIFEXITED (status)) > { > + /* It is expected that a successful test execution handles all its > + children. */ OK. > + > if (config->expected_status == 0) > { > if (config->expected_signal == 0) > @@ -486,6 +489,10 @@ support_test_main (int argc, char **argv, const struct test_config *config) > /* Process was killed by timer or other signal. */ > else > { > + /* Kill the whole process group if test process aborts or exits prior > + waiting for them. */ > + kill (-test_pid, SIGKILL); OK. Send negation of process group id to kill the whole process group. The process group id is the same as the id of the process that created the group so test_pid is the right value. Notes: - Should we be using test_pgid to make this clear? - Should this cleanup be refactored a bit to avoid duplication from signal_handler() and support_test_main() e.g. kill_process_group () which runs kill looks for errors prints diagnostic etc. > + > if (config->expected_signal == 0) > { > printf ("Didn't expect signal from child: got `%s'\n", >
* Adhemerval Zanella: > Some testcases that create multiple subprocesses might abort or exit > prior waiting for their children. In such case, support_test_main > does not try to kill the spawned test process group (as in the > test timeout case). Does this actually work? Is the process group preserved if a process is reparented to init? Thanks, Florian
On Thu, Feb 20, 2020 at 10:05 AM Florian Weimer <fweimer@redhat.com> wrote: > > * Adhemerval Zanella: > > > Some testcases that create multiple subprocesses might abort or exit > > prior waiting for their children. In such case, support_test_main > > does not try to kill the spawned test process group (as in the > > test timeout case). > > Does this actually work? Is the process group preserved if a process is > reparented to init? No, you are right. There is a race too which I didn't see. Once you waitpid the pid and pgid might be free for reuse and we can't guarantee this will work. Cheers, Carlos.
On Thu, Feb 20, 2020 at 9:53 AM Carlos O'Donell <codonell@redhat.com> wrote: > > On 2/20/20 9:34 AM, Adhemerval Zanella wrote: > > Some testcases that create multiple subprocesses might abort or exit > > prior waiting for their children. In such case, support_test_main > > does not try to kill the spawned test process group (as in the > > test timeout case). > > > > On example that we are observing in internal tests is when > > malloc/tst-mallocfork2 fails to fork in the signal handling (due > > either maximum number of process or other non expected failure). > > > > This patch kill the process group in the case of failed execution, > > similar on how it is done on timeout. > > LGTM. > > Reviewed-by: Carlos O'Donell <carlos@redhat.com> I'm withdrawing my reviewed-by here, since there is a race. Florian highlighted that the children are all going to be reparented to init and that therefore we can't catch them anymore. The only plausible solution here is to use the controlling terminal to kill the orphan children. > > Checked on x86_64-linux-gnu. > > --- > > support/support_test_main.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/support/support_test_main.c b/support/support_test_main.c > > index e3f0bf15f2..ac9f710fb7 100644 > > --- a/support/support_test_main.c > > +++ b/support/support_test_main.c > > @@ -459,6 +459,9 @@ support_test_main (int argc, char **argv, const struct test_config *config) > > /* Process terminated normaly without timeout etc. */ > > if (WIFEXITED (status)) > > { > > + /* It is expected that a successful test execution handles all its > > + children. */ > > OK. > > > + > > if (config->expected_status == 0) > > { > > if (config->expected_signal == 0) > > @@ -486,6 +489,10 @@ support_test_main (int argc, char **argv, const struct test_config *config) > > /* Process was killed by timer or other signal. */ > > else > > { > > + /* Kill the whole process group if test process aborts or exits prior > > + waiting for them. */ > > + kill (-test_pid, SIGKILL); > > OK. Send negation of process group id to kill the whole process group. The > process group id is the same as the id of the process that created the group > so test_pid is the right value. > > Notes: > - Should we be using test_pgid to make this clear? > - Should this cleanup be refactored a bit to avoid duplication from > signal_handler() and support_test_main() e.g. kill_process_group () > which runs kill looks for errors prints diagnostic etc. > > > + > > if (config->expected_signal == 0) > > { > > printf ("Didn't expect signal from child: got `%s'\n", > > > > > -- > Cheers, > Carlos.
diff --git a/support/support_test_main.c b/support/support_test_main.c index e3f0bf15f2..ac9f710fb7 100644 --- a/support/support_test_main.c +++ b/support/support_test_main.c @@ -459,6 +459,9 @@ support_test_main (int argc, char **argv, const struct test_config *config) /* Process terminated normaly without timeout etc. */ if (WIFEXITED (status)) { + /* It is expected that a successful test execution handles all its + children. */ + if (config->expected_status == 0) { if (config->expected_signal == 0) @@ -486,6 +489,10 @@ support_test_main (int argc, char **argv, const struct test_config *config) /* Process was killed by timer or other signal. */ else { + /* Kill the whole process group if test process aborts or exits prior + waiting for them. */ + kill (-test_pid, SIGKILL); + if (config->expected_signal == 0) { printf ("Didn't expect signal from child: got `%s'\n",