From patchwork Tue May 17 06:57:13 2016
X-Patchwork-Submitter: Stefan Liebler
X-Patchwork-Id: 12304
To: libc-alpha@sourceware.org
From: Stefan Liebler
Subject: [PATCH] Fix tst-cancel17/tst-cancelx17, which sometimes segfaults while exiting.
Date: Tue, 17 May 2016 08:57:13 +0200

Hi,

The testcase tst-cancel[x]17 sometimes ends with a segmentation fault.
This happens in one of 10000 runs. By then the actual test has already
succeeded and returned from do_test(). The segmentation fault occurs
after returning from main, in _dl_fini().

In those cases, aio_read(&a) was not canceled because the read request
was already in progress. In the meantime, aio_write(ap) wrote something
to the pipe, so the read request is able to read the requested byte.
The read request has not finished before do_test() returns.
When it finishes, it writes the return value and error code of the
read syscall to the struct aiocb a, which lies on the stack of do_test.
The stack of the subsequent call to _dl_fini, or to _dl_sort_fini, which
is inlined into _dl_fini, is thereby corrupted.
On s390, the code then reads a zero and decrements it by 1:
  unsigned int k = nmaps - 1;
  struct link_map **runp = maps[k]->l_initfini;
The subsequent load from unmapped memory leads to the segmentation fault.
The stack corruption also happens on other architectures.
I have seen it on x86 and ppc, too.

This patch adds an aio_suspend call to ensure that the read request has
finished before returning from do_test().

Okay to commit?

Bye
Stefan

ChangeLog:

	* nptl/tst-cancel17.c (do_test): Wait for finishing aio_read(&a).

commit fd27163dc65df76132946c3eca29dfb82fea2fcc
Author: Stefan Liebler
Date:   Fri May 13 15:10:30 2016 -0400

    Fix tst-cancel17/tst-cancelx17, which sometimes segfaults while exiting.

    The testcase tst-cancel[x]17 sometimes ends with a segmentation fault.
    This happens in one of 10000 runs. By then the actual test has already
    succeeded and returned from do_test(). The segmentation fault occurs
    after returning from main, in _dl_fini().

    In those cases, aio_read(&a) was not canceled because the read request
    was already in progress. In the meantime, aio_write(ap) wrote something
    to the pipe, so the read request is able to read the requested byte.
    The read request has not finished before do_test() returns.
    When it finishes, it writes the return value and error code of the
    read syscall to the struct aiocb a, which lies on the stack of do_test.
    The stack of the subsequent call to _dl_fini, or to _dl_sort_fini, which
    is inlined into _dl_fini, is thereby corrupted.
    On s390, the code then reads a zero and decrements it by 1:
      unsigned int k = nmaps - 1;
      struct link_map **runp = maps[k]->l_initfini;
    The load from unmapped memory leads to the segmentation fault.
    The stack corruption also happens on other architectures.
    I have seen it on x86 and ppc, too.
    This patch adds an aio_suspend call to ensure that the read request has
    finished before returning from do_test().

    ChangeLog:

    	* nptl/tst-cancel17.c (do_test): Wait for finishing aio_read(&a).

diff --git a/nptl/tst-cancel17.c b/nptl/tst-cancel17.c
index fb89292..9ff4e27 100644
--- a/nptl/tst-cancel17.c
+++ b/nptl/tst-cancel17.c
@@ -333,6 +333,22 @@ do_test (void)
 
   puts ("early cancellation succeeded");
 
+  if (ap == &a2)
+    {
+      /* The aio_read(&a) was not canceled, because the read request was
+         already in progress. In the meanwhile aio_write(ap) wrote something
+         to the pipe and the read request either has already been finished or
+         is able to read the requested byte.
+         Wait for the read request before returning from this function, because
+         the return value and error code from the read syscall will be written
+         to the struct aiocb a, which lays on the stack of this function.
+         Otherwise the stack from subsequent function calls - e.g. _dl_fini -
+         will be corrupted, which can lead to undefined behaviour like a
+         segmentation fault. */
+      const struct aiocb *l[1] = { &a };
+      TEMP_FAILURE_RETRY (aio_suspend(l, 1, NULL));
+    }
+
   return 0;
 }
 