From patchwork Thu May 27 17:28:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 43609 X-Patchwork-Delegate: fweimer@redhat.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6EE743959E52; Thu, 27 May 2021 17:28:34 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6EE743959E52 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1622136514; bh=nEbzpQuPbu3mv3pYXtWNQ9pLfOMpsj9Uoi0vcQek6ZQ=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=qDM3WG1m63U2zLcgRZKHW8lnUzQucCiN3DYYggDVkzX3CwNxBJkjTvlPNsyyoHuCs EEmQy/wgHPRayD81NBMSOKBLNszfN/c60StBXJCLLQuhQCKfd5soiLO79yzJFQYLDs MG81K5iTn13jZLDIOmD7tbKMb5d2IXXrSjtT/VbE= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) by sourceware.org (Postfix) with ESMTPS id 338053844007 for ; Thu, 27 May 2021 17:28:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 338053844007 Received: by mail-qk1-x72b.google.com with SMTP id o3so1421047qke.7 for ; Thu, 27 May 2021 10:28:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nEbzpQuPbu3mv3pYXtWNQ9pLfOMpsj9Uoi0vcQek6ZQ=; b=XgQgKzQWhJAGAkGznV1qHwPN9UNPVeY4JF/DJRaGmG6uvJqyMcSoa0b/HRgb8KE6iS sZegDGccTlC74he7K0itx/u3gNXeeGMNfA+mU9QyJiT8cEFu5qsNNWXvxDcIsErb0xYG NU/Uc2jpmOVrTlkXYnMT1q3BDaSsfPQjD/9iznyQ39KjzjSoDxIlR6Kwp6uvPTHqCa9y r6yJMQI7/Pkh97QttgHnRdkMMODcj4FtqiRiXogl+LL5stjiwGyT7lqoqY6Wk9SfL8He BBB3VZOIEQUAp9j7mEYu544SMG1MnYwlYldbpPLf3IC9jTiYik+oaOzWrEf7sTj2Kg0k 7QPA== X-Gm-Message-State: AOAM530e7z3i1M9kdVKWmBUHcpM8SSaCwBhx5RfUsyJqUkg89hPzv4IO N0FkGlqsoXZ+qAHe6vT2L4nMzwQjQVXWCw== X-Google-Smtp-Source: ABdhPJzKH9SNnyvKyZyUoX12DS5xC1Z2+fqkp+wRXatKtFy6ohA94dLGDfh7F5okJnQEUdH3UFySiA== X-Received: by 2002:a37:7c02:: with SMTP id x2mr4619750qkc.483.1622136510358; Thu, 27 May 2021 10:28:30 -0700 (PDT) Received: from birita.. ([177.194.37.86]) by smtp.googlemail.com with ESMTPSA id 25sm1767913qtd.51.2021.05.27.10.28.29 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 May 2021 10:28:30 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v2 2/9] nptl: Deallocate the thread stack on setup failure (BZ #19511) Date: Thu, 27 May 2021 14:28:16 -0300 Message-Id: <20210527172823.3461314-3-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210527172823.3461314-1-adhemerval.zanella@linaro.org> References: <20210527172823.3461314-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" To setup either the thread scheduling parameters or affinity, pthread_create enforce synchronization on created thread wait until its parent either unlock the 'struct pthread' or send a cancellation signal if a failure occurs. However, cancelling the thread does not deallocate the newly created stack since cancellation expects that a pthread_join to deallocate any allocated thread resouces (threads stack or TLS). I used a different strategy than the one proposed on BZ#19511 comment #4. Making the parent responsible for the cleanup requires additional synchronization similar to what pthread_join does. Instead, this patch reassigns an unused 'struct pthread' member, parent_cancelhandling_unsed, to indicate whether the setup has failed and set the thread itself to deallocate the allocated resouces (similar to what detached mode does). This strategy also simplifies by not using thread cancellation and thus not running libgcc_so load in the signal handler (which is avoided in thread cancellation since 'pthread_cancel' is the one responsible to dlopen libgcc_s). Checked on x86_64-linux-gnu and aarch64-linux-gnu. --- nptl/allocatestack.c | 1 + nptl/descr.h | 5 ++-- nptl/pthread_create.c | 70 ++++++++++++++++++++----------------------- 3 files changed, 36 insertions(+), 40 deletions(-) diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c index dc81a2ca73..2114bd2e27 100644 --- a/nptl/allocatestack.c +++ b/nptl/allocatestack.c @@ -161,6 +161,7 @@ get_cached_stack (size_t *sizep, void **memp) /* Cancellation handling is back to the default. */ result->cancelhandling = 0; result->cleanup = NULL; + result->setup_failed = 0; /* No pending event. */ result->nextevent = NULL; diff --git a/nptl/descr.h b/nptl/descr.h index 3de9535449..9d8297b45f 100644 --- a/nptl/descr.h +++ b/nptl/descr.h @@ -340,8 +340,9 @@ struct pthread /* True if thread must stop at startup time. */ bool stopped_start; - /* Formerly used for dealing with cancellation. */ - int parent_cancelhandling_unsed; + /* Indicate that a thread creation setup has failed (for instance the + scheduler or affinity). */ + int setup_failed; /* Lock to synchronize access to the descriptor. */ int lock; diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index aacbba1cac..018af30c85 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -169,17 +169,15 @@ late_init (void) (c) If the detached thread setup failed and THREAD_RAN is true, then the creating thread releases ownership to the new thread by - sending a cancellation signal. All threads set THREAD_RAN to - true as quickly as possible after returning from the OS kernel's - thread creation routine. + setting the PD->setup_failed 1 and releasing PD->lock. All threads + set THREAD_RAN to true as quickly as possible after returning from + the OS kernel's thread creation routine. (d) If the joinable thread setup failed and THREAD_RAN is true, then then the creating thread retains ownership of PD and must cleanup state. Ownership cannot be released to the process via the return of pthread_create since a non-zero result entails PD is undefined and therefore cannot be joined to free the resources. - We privately call pthread_join on the thread to finish handling - the resource shutdown (Or at least we should, see bug 19511). (e) If the thread creation failed and THREAD_RAN is false, then the creating thread retains ownership of PD and must cleanup state. @@ -239,8 +237,8 @@ late_init (void) The return value is zero for success or an errno code for failure. If the return value is ENOMEM, that will be translated to EAGAIN, so create_thread need not do that. On failure, *THREAD_RAN should - be set to true iff the thread actually started up and then got - canceled before calling user code (*PD->start_routine). */ + be set to true iff the thread actually started up but before calling + the user code (*PD->start_routine). */ static int _Noreturn start_thread (void *arg); @@ -308,35 +306,23 @@ static int create_thread (struct pthread *pd, const struct pthread_attr *attr, == -1)) return errno; - /* It's started now, so if we fail below, we'll have to cancel it - and let it clean itself up. */ + /* It's started now, so if we fail below, we'll have to let it clean itself + up. */ *thread_ran = true; /* Now we have the possibility to set scheduling parameters etc. */ if (attr != NULL) { - int res; - /* Set the affinity mask if necessary. */ if (need_setaffinity) { assert (*stopped_start); - res = INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, - attr->extension->cpusetsize, - attr->extension->cpuset); - + int res = INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, + attr->extension->cpusetsize, + attr->extension->cpuset); if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (res))) - err_out: - { - /* The operation failed. We have to kill the thread. - We let the normal cancellation mechanism do the work. */ - - pid_t pid = __getpid (); - INTERNAL_SYSCALL_CALL (tgkill, pid, pd->tid, SIGCANCEL); - - return INTERNAL_SYSCALL_ERRNO (res); - } + return INTERNAL_SYSCALL_ERRNO (res); } /* Set the scheduling parameters. */ @@ -344,11 +330,10 @@ static int create_thread (struct pthread *pd, const struct pthread_attr *attr, { assert (*stopped_start); - res = INTERNAL_SYSCALL_CALL (sched_setscheduler, pd->tid, - pd->schedpolicy, &pd->schedparam); - + int res = INTERNAL_SYSCALL_CALL (sched_setscheduler, pd->tid, + pd->schedpolicy, &pd->schedparam); if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (res))) - goto err_out; + return INTERNAL_SYSCALL_ERRNO (res); } } @@ -424,17 +409,22 @@ start_thread (void *arg) have ownership (see CONCURRENCY NOTES above). */ if (__glibc_unlikely (pd->stopped_start)) { - int oldtype = LIBC_CANCEL_ASYNC (); - /* Get the lock the parent locked to force synchronization. */ lll_lock (pd->lock, LLL_PRIVATE); /* We have ownership of PD now. */ + if (pd->setup_failed == 1) + { + /* In the case of a failed setup, the created thread will be + responsible to free up the allocated resources. The + detached mode ensure it won't be joined and it will trigger + the required cleanup. */ + pd->joinid = pd; + __do_cancel (); + } /* And give it up right away. */ lll_unlock (pd->lock, LLL_PRIVATE); - - LIBC_CANCEL_RESET (oldtype); } LIBC_PROBE (pthread_start, 3, (pthread_t) pd, pd->start_routine, pd->arg); @@ -561,7 +551,7 @@ start_thread (void *arg) pd->setxid_futex = 0; } - /* If the thread is detached free the TCB. */ + /* If the thread is detached or an setup error occurred, free the TCB. */ if (IS_DETACHED (pd)) /* Free the TCB. */ __nptl_free_tcb (pd); @@ -761,7 +751,6 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr, signal mask of this thread, so save it in the startup information. */ pd->sigmask = original_sigmask; - /* Reset the cancellation signal mask in case this thread is running cancellation. */ __sigdelset (&pd->sigmask, SIGCANCEL); @@ -820,9 +809,14 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr, CONCURRENCY NOTES above). We can assert that STOPPED_START must have been true because thread creation didn't fail, but thread attribute setting did. */ - /* See bug 19511 which explains why doing nothing here is a - resource leak for a joinable thread. */ - assert (stopped_start); + { + assert (stopped_start); + /* State (a), we own PD. The thread blocked on this lock will + finish due setup_failed being set. The thread will be + responsible to cleanup any allocated resource. */ + pd->setup_failed = 1; + lll_unlock (pd->lock, LLL_PRIVATE); + } else { /* State (e) and we have ownership of PD (see CONCURRENCY