From patchwork Fri Nov 17 18:44:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 80170 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4B1C7385B83C for ; Fri, 17 Nov 2023 18:44:45 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id E57853858006 for ; Fri, 17 Nov 2023 18:44:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E57853858006 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E57853858006 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700246655; cv=none; b=MFR/A7oohd6o5x5DPsU5SyFgCh+oJmzoUPeXVmMqjo//7fU6vlrhSbAiERLA7Bd/IZm1t90JTymQixlm7V2me7aGH+jKuL2DFulhd7F/lfMd4IhwH4xexXHTaMu6IbThPo6p2xJUO6p5oiqs6Y1qHV60iptfGfDSlnEmEayDnwU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700246655; c=relaxed/simple; bh=2JymALvV080LEpyZ2ZrYpLiBI3UUJdFsCSYp4KNmj8c=; h=DKIM-Signature:From:To:Subject:Message-ID:Date:MIME-Version; b=gouGDYgQG+SNTvhehuSI9jkzPCZuTiQnMreppDNeALWGxpv4DfdX2CTrFf7f9+uLtqk3d0y5uMnf4cWajvPows2z/wmClG6zQQz8lzqRrMU0aD1CPgrDcz6tQ1b/+vskMDcpDmNetixqbpWrIw9Is2W7K5p18b6Ybd3qfzq8xq4= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1700246654; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=MSSPhdVUZm2JD53sl0YsUFq2OxjU/2R9Hfmxpx8AIpU=; b=PW1rLF70TVAq7WjNguW5EUWnlajCiWbhuThpL/V4eDu6KPvJzX3BZjH9uxvk7XxoGpT2mV LOpm7oQ2VaL+si24B1FqQ9DzgAeQeeZ8ZRJhs10joIgQwm6PnLCH7ID+9z40W568K2U2jT mBdGstDTttCMqHIsrkTfR7i1Zd/4Y1I= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-264--LXSPE2rMSypjM5LhrD-Dw-1; Fri, 17 Nov 2023 13:44:13 -0500 X-MC-Unique: -LXSPE2rMSypjM5LhrD-Dw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id F00C08007B3 for ; Fri, 17 Nov 2023 18:44:12 +0000 (UTC) Received: from oldenburg.str.redhat.com (unknown [10.2.16.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 9C6362026D66 for ; Fri, 17 Nov 2023 18:44:12 +0000 (UTC) From: Florian Weimer To: libc-alpha@sourceware.org Subject: [PATCH 1/3] stdlib: Avoid another self-comparison in qsort In-Reply-To: Message-ID: <58229f369d7bb3eb1883de9ba0ff2ad974853c5b.1700246487.git.fweimer@redhat.com> References: X-From-Line: 58229f369d7bb3eb1883de9ba0ff2ad974853c5b Mon Sep 17 00:00:00 2001 Date: Fri, 17 Nov 2023 19:44:10 +0100 User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.3 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.4 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org In the insertion phase, we could run off the start of the array if the comparison function never runs zero. In that case, it never finds the initial element that terminates the iteration. Reviewed-by: Adhemerval Zanella --- stdlib/qsort.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index ad110e8a89..6d0c4447ec 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -217,7 +217,7 @@ insertion_sort_qsort_partitions (void *const pbase, size_t total_elems, while ((run_ptr += size) <= end_ptr) { tmp_ptr = run_ptr - size; - while (cmp (run_ptr, tmp_ptr, arg) < 0) + while (run_ptr != tmp_ptr && cmp (run_ptr, tmp_ptr, arg) < 0) tmp_ptr -= size; tmp_ptr += size; From patchwork Fri Nov 17 18:44:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 80168 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4AF14385AE58 for ; Fri, 17 Nov 2023 18:44:35 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 7FF683858018 for ; Fri, 17 Nov 2023 18:44:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7FF683858018 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7FF683858018 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700246661; cv=none; b=rnzMzeIUAjS42lbxUCWOs2upDxqR/r+wgbU7Ahz8pYFXvcXCuU+IyCRi4gmq0gPhvQSw5OUOiJfStwkWpfSHFwipmRQ++gANBJbs/FsYDmbesUe0BgeSHWY1/2vKjl6xckoA2/ZVm/QMUtpSrODSUO0SWvZkfo7ORMCGJEPjcz0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700246661; c=relaxed/simple; bh=vkoGIyqMMsbq4nX+BDP48EYMNcINDns0Idy/FIq/eHo=; h=DKIM-Signature:From:To:Subject:Message-ID:Date:MIME-Version; b=KYt/Np9dW43yJh9eYvV04rkejaXBNVjQVrkb/mg6Fw1srpksFHB8BCGLcFuzyZiwRkSEygH7H3Eb6lAETAi42zEdY3Rzy+9jPPrlYUS+IFJc/wncHi4dq7j3E9jKhFvYIYw9zzqJ6mJi6C38EP9GwJHUdAcsIZoyTigqvttLwDc= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1700246659; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=xkay98Oget302l6qul8LB6q9wuMeL6wj27RmpkL6Nls=; b=Xfa25ytckcOYtPiV2DbwEEJsYB4m/bS/9LPVtLNPoezFOysgyJl90vbHHzG+6UAaU9bKPu uzv/jLmShLaGvY9JZQMFFml3Z1WHFyEZtU3YPqfAOMVJFouF6FZud6F9m6z2ChRf32hsJT ac1/pTxKKr9qZqNNVTgr6D9Mhc9+9R8= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-608-jlNX5L0DOrGBhZnwxv9V1A-1; Fri, 17 Nov 2023 13:44:17 -0500 X-MC-Unique: jlNX5L0DOrGBhZnwxv9V1A-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 59D87285F98C for ; Fri, 17 Nov 2023 18:44:17 +0000 (UTC) Received: from oldenburg.str.redhat.com (unknown [10.2.16.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 7E8CD492BE8 for ; Fri, 17 Nov 2023 18:44:16 +0000 (UTC) From: Florian Weimer To: libc-alpha@sourceware.org Subject: [PATCH 2/3] stdlib: Handle various corner cases in the fallback heapsort for qsort In-Reply-To: Message-ID: <832cc3db076991ad9b4e2f2c1f85133d335181a3.1700246487.git.fweimer@redhat.com> References: X-From-Line: 832cc3db076991ad9b4e2f2c1f85133d335181a3 Mon Sep 17 00:00:00 2001 Date: Fri, 17 Nov 2023 19:44:14 +0100 User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.3 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org The previous implementation did not consistently apply the rule that the child nodes of node K are at 2 * K + 1 and 2 * K + 2, or that the parent node is at (K - 1) / 2. Add an internal test that targets the heapsort implementation directly. Reported-by: Stepan Golosunov Reviewed-by: Adhemerval Zanella --- stdlib/Makefile | 1 + stdlib/qsort.c | 55 ++++++++++++------ stdlib/tst-qsort4.c | 134 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 173 insertions(+), 17 deletions(-) create mode 100644 stdlib/tst-qsort4.c diff --git a/stdlib/Makefile b/stdlib/Makefile index 6af606136e..48688f6a27 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -261,6 +261,7 @@ tests := \ # tests tests-internal := \ + tst-qsort4 \ tst-strtod1i \ tst-strtod3 \ tst-strtod4 \ diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 6d0c4447ec..a2f9e916ef 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -125,29 +125,44 @@ pop (stack_node *top, char **lo, char **hi, size_t *depth) return top; } -/* NB: N is inclusive bound for BASE. */ +/* Establish the heap condition at index K, that is, the key at K will + not be less than either of its children, at 2 * K + 1 and 2 * K + 2 + (if they exist). N is the last valid index. */ static inline void siftdown (void *base, size_t size, size_t k, size_t n, enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) { - while (k <= n / 2) + /* There can only be a heap condition violation if there are + children. */ + while (2 * k + 1 <= n) { - size_t j = 2 * k; + /* Left child. */ + size_t j = 2 * k + 1; + /* If the right child is larger, use it. */ if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) j++; + /* If k is already >= to its children, we are done. */ if (j == k || cmp (base + (k * size), base + (j * size), arg) >= 0) break; + /* Heal the violation. */ do_swap (base + (size * j), base + (k * size), size, swap_type); + + /* Swapping with j may have introduced a violation at j. Fix + it in the next loop iteration. */ k = j; } } +/* Establish the heap condition for the indices 0 to N (inclusive). */ static inline void heapify (void *base, size_t size, size_t n, enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) { + /* If n is odd, k = n / 2 has a left child at n, so this is the + largest index that can have a heap condition violation regarding + its children. */ size_t k = n / 2; while (1) { @@ -157,32 +172,38 @@ heapify (void *base, size_t size, size_t n, enum swap_type_t swap_type, } } -/* A non-recursive heapsort, used on introsort implementation as a fallback - routine with worst-case performance of O(nlog n) and worst-case space - complexity of O(1). It sorts the array starting at BASE and ending at - END, with each element of SIZE bytes. The SWAP_TYPE is the callback - function used to swap elements, and CMP is the function used to compare - elements. */ +/* A non-recursive heapsort, used on introsort implementation as a + fallback routine with worst-case performance of O(nlog n) and + worst-case space complexity of O(1). It sorts the array starting + at BASE and ending at END (inclusive), with each element of SIZE + bytes. The SWAP_TYPE is the callback function used to swap + elements, and CMP is the function used to compare elements. */ static void heapsort_r (void *base, void *end, size_t size, enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) { - const size_t count = ((uintptr_t) end - (uintptr_t) base) / size; - - if (count < 2) + size_t n = ((uintptr_t) end - (uintptr_t) base) / size; + if (n <= 1) + /* Handled by insertion sort. */ return; - size_t n = count - 1; - /* Build the binary heap, largest value at the base[0]. */ heapify (base, size, n, swap_type, cmp, arg); - /* On each iteration base[0:n] is the binary heap, while base[n:count] - is sorted. */ - while (n > 0) + while (true) { + /* Indices 0 .. n contain the binary heap. Extract the largest + element put it into the final position in the array. */ do_swap (base, base + (n * size), size, swap_type); + + /* The heap is now one element shorter. */ n--; + if (n == 0) + break; + + /* By swapping in elements 0 and the previous value of n (now at + n + 1), we likely introduced a heap condition violation. Fix + it for the reduced heap. */ siftdown (base, size, 0, n, swap_type, cmp, arg); } } diff --git a/stdlib/tst-qsort4.c b/stdlib/tst-qsort4.c new file mode 100644 index 0000000000..d5b8d05a91 --- /dev/null +++ b/stdlib/tst-qsort4.c @@ -0,0 +1,134 @@ +/* Test the heapsort implementation behind qsort. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include "qsort.c" + +#include +#include +#include + +static int +cmp (const void *a1, const void *b1, void *closure) +{ + const signed char *a = a1; + const signed char *b = b1; + return *a - *b; +} + +/* Wrapper around heapsort_r that set ups the required variables. */ +static void +heapsort_wrapper (void *const pbase, size_t total_elems, size_t size, + __compar_d_fn_t cmp, void *arg) +{ + char *base_ptr = (char *) pbase; + char *lo = base_ptr; + char *hi = &lo[size * (total_elems - 1)]; + + if (total_elems <= 1) + /* Avoid lossage with unsigned arithmetic below. */ + return; + + enum swap_type_t swap_type; + if (is_aligned (pbase, size, 8)) + swap_type = SWAP_WORDS_64; + else if (is_aligned (pbase, size, 4)) + swap_type = SWAP_WORDS_32; + else + swap_type = SWAP_BYTES; + heapsort_r (lo, hi, size, swap_type, cmp, arg); +} + +static void +check_one_sort (signed char *array, int length) +{ + signed char *copy = xmalloc (length); + memcpy (copy, array, length); + heapsort_wrapper (copy, length, 1, cmp, NULL); + + /* Verify that the result is sorted. */ + for (int i = 1; i < length; ++i) + if (copy[i] < copy[i - 1]) + { + support_record_failure (); + printf ("error: sorting failure for length %d at offset %d\n", + length, i - 1); + printf ("input:"); + for (int i = 0; i < length; ++i) + printf (" %d", array[i]); + printf ("\noutput:"); + for (int i = 0; i < length; ++i) + printf (" %d", copy[i]); + putchar ('\n'); + break; + } + + /* Verify that no elements went away or were added. */ + { + int expected_counts[256]; + for (int i = 0; i < length; ++i) + ++expected_counts[array[i] & 0xff]; + int actual_counts[256]; + for (int i = 0; i < length; ++i) + ++actual_counts[copy[i] & 0xff]; + for (int i = 0; i < 256; ++i) + TEST_COMPARE (expected_counts[i], expected_counts[i]); + } + + free (copy); +} + +/* Enumerate all possible combinations of LENGTH elements. */ +static void +check_combinations (int length, signed char *start, int offset) +{ + if (offset == length) + check_one_sort (start, length); + else + for (int i = 0; i < length; ++i) + { + start[offset] = i; + check_combinations(length, start, offset + 1); + } +} + +static int +do_test (void) +{ + /* A random permutation of 20 values. */ + check_one_sort ((signed char[20]) {5, 12, 16, 10, 14, 11, 9, 13, 8, 15, + 0, 17, 3, 7, 1, 18, 2, 19, 4, 6}, 20); + + + /* A permutation that appeared during adversial testing for the + quicksort pass. */ + check_one_sort ((signed char[16]) {15, 3, 4, 2, 1, 0, 8, 7, 6, 5, 14, + 13, 12, 11, 10, 9}, 16); + + /* Array lengths 2 and less are not handled by heapsort_r and + deferred to insertion sort. */ + for (int i = 3; i <= 8; ++i) + { + signed char *buf = xmalloc (i); + check_combinations (i, buf, 0); + free (buf); + } + + return 0; +} + +#include From patchwork Fri Nov 17 18:44:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 80169 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CE31C385C6EF for ; Fri, 17 Nov 2023 18:44:40 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id A790A3857C41 for ; Fri, 17 Nov 2023 18:44:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A790A3857C41 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A790A3857C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700246667; cv=none; b=xcxusCZR7SVWPo1UyMzN51rSlewrpUwYkh62HWL6H6dhXU5gA6udVHP6mFYComRGd6GHVc7n0M7E3/m410IMjRwX+0fjk6ABWVHOKgGC+u6Wt6EpI7jLj0R1keoQ4PMDfB1tSQZPqKfUEV0opJl0zHWApflXtRU5FJ4uub2JyK0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700246667; c=relaxed/simple; bh=8R4Y1HXEWSg07M3kEhvtilNiXqPZlfBIPpSlqssVRPI=; h=DKIM-Signature:From:To:Subject:Message-ID:Date:MIME-Version; b=sg1w+EtiwyiZkkiJXDNQIQNRY62dDMazrrJeAeL44mUJh/iPQd3org3xFAqQTN0M7g6Y1+Ovo9VNInCvT8jVUGPs8BZ5kG+Tr2kEfuvA9nkoZzBGgacfBomeecLeUEJHcyuf/EOrLmlvFsfMjw9wXnpyUrR8l+QkdM+rNcOtLwc= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1700246664; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w9TiBOyoT6acXs74yA5pxgBD7uLdoROuyBKQFOu+g/Y=; b=OngV+r1q4hn952z54jfNagGkEn1mtYFhftp1KunrMOXMKLGlXQ5JlmQhdvI+WCX+FPpq4M 6Fb21NcP9b51cZLza76+UQX47g99mFKK1Ml3Vm9H7RnE+h7Z3uJl/ctUSfZuqUcwTDeI3t cgli95dcvSVDG51RV3JQ+vx9RpXn9rY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-441-9Fo_nssNMHGhhT5Ly1efug-1; Fri, 17 Nov 2023 13:44:21 -0500 X-MC-Unique: 9Fo_nssNMHGhhT5Ly1efug-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3E31D185A781 for ; Fri, 17 Nov 2023 18:44:21 +0000 (UTC) Received: from oldenburg.str.redhat.com (unknown [10.2.16.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 85F2CC09A60 for ; Fri, 17 Nov 2023 18:44:20 +0000 (UTC) From: Florian Weimer To: libc-alpha@sourceware.org Subject: [PATCH 3/3] stdlib: The qsort implementation needs to use heapsort in more cases In-Reply-To: Message-ID: References: X-From-Line: dc4a606da98f281312bad7a1018a50a71a0d7def Mon Sep 17 00:00:00 2001 Date: Fri, 17 Nov 2023 19:44:19 +0100 User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.3 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.8 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, URI_DOTEDU autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org The existing logic avoided internal stack overflow. To avoid a denial-of-service condition with adversarial input, it is necessary to fall over to heapsort if tail-recursing deeply, too, which does not result in a deep stack of pending partitions. The new test stdlib/tst-qsort5 is based on Douglas McIlroy's paper on this subject. Reviewed-by: Adhemerval Zanella --- stdlib/Makefile | 3 + stdlib/qsort.c | 17 +++-- stdlib/tst-qsort5.c | 171 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 187 insertions(+), 4 deletions(-) create mode 100644 stdlib/tst-qsort5.c diff --git a/stdlib/Makefile b/stdlib/Makefile index 48688f6a27..6194d1cb22 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -215,6 +215,7 @@ tests := \ tst-qsort \ tst-qsort2 \ tst-qsort3 \ + tst-qsort5 \ tst-quick_exit \ tst-rand48 \ tst-rand48-2 \ @@ -512,3 +513,5 @@ $(objpfx)tst-setcontext3.out: tst-setcontext3.sh $(objpfx)tst-setcontext3 '$(run-program-env)' '$(test-program-prefix-after-env)' \ $(common-objpfx)stdlib/; \ $(evaluate-test) + +$(objpfx)tst-qsort5: $(libm) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index a2f9e916ef..be01fb5598 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -389,14 +389,23 @@ __qsort_r (void *const pbase, size_t total_elems, size_t size, { if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore both small partitions. */ - top = pop (top, &lo, &hi, &depth); + { + top = pop (top, &lo, &hi, &depth); + --depth; + } else - /* Ignore small left partition. */ - lo = left_ptr; + { + /* Ignore small left partition. */ + lo = left_ptr; + --depth; + } } else if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore small right partition. */ - hi = right_ptr; + { + hi = right_ptr; + --depth; + } else if ((right_ptr - lo) > (hi - left_ptr)) { /* Push larger left partition indices. */ diff --git a/stdlib/tst-qsort5.c b/stdlib/tst-qsort5.c new file mode 100644 index 0000000000..d3a88c30f8 --- /dev/null +++ b/stdlib/tst-qsort5.c @@ -0,0 +1,171 @@ +/* Adversarial test for qsort_r. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* The approach follows Douglas McIlroy, A Killer Adversary for + Quicksort. Software—Practice and Experience 29 (1999) 341-344. + Downloaded + (2023-11-17). */ + +#include +#include +#include +#include +#include + +struct context +{ + /* Called the gas value in the paper. This value is larger than all + other values (length minus one will do), so comparison with any + decided value has a known result. */ + int undecided_value; + + /* If comparing undecided values, one of them as to be assigned a + value to ensure consistency with future comparisons. This is the + value that will be used. Starts out at zero. */ + int next_decided; + + /* Used to trick pivot selection. Deciding the value for the last + seen undcided value in a decided/undecided comparison happens + to trick the many qsort implementations. */ + int last_undecided_index; + + /* This array contains the actually asigned values. The call to + qsort_r sorts a different array that contains indices into this + array. */ + int *decided_values; +}; + +static int +compare_opponent (const void *l1, const void *r1, void *ctx1) +{ + const int *l = l1; + const int *r = r1; + struct context *ctx = ctx1; + int rvalue = ctx->decided_values[*r]; + int lvalue = ctx->decided_values[*l]; + + if (lvalue == ctx->undecided_value) + { + if (rvalue == ctx->undecided_value) + { + /* Both values are undecided. In this case, make a decision + for the last-used undecided value. This is tweak is very + specific to quicksort. */ + if (*l == ctx->last_undecided_index) + { + ctx->decided_values[*l] = ctx->next_decided; + ++ctx->next_decided; + /* The undecided value or *r is greater. */ + return -1; + } + else + { + ctx->decided_values[*r] = ctx->next_decided; + ++ctx->next_decided; + /* The undecided value for *l is greater. */ + return 1; + } + } + else + { + ctx->last_undecided_index = *l; + return 1; + } + } + else + { + /* *l is a decided value. */ + if (rvalue == ctx->undecided_value) + { + ctx->last_undecided_index = *r; + /* The undecided value for *r is greater. */ + return -1; + } + else + return lvalue - rvalue; + } +} + +/* Return a pointer to the adversarial permutation of length N. */ +static int * +create_permutation (size_t n) +{ + struct context ctx = + { + .undecided_value = n - 1, /* Larger than all other values. */ + .decided_values = xcalloc (n, sizeof (int)), + }; + for (size_t i = 0; i < n; ++i) + ctx.decided_values[i] = ctx.undecided_value; + int *scratch = xcalloc (n, sizeof (int)); + for (size_t i = 0; i < n; ++i) + scratch[i] = i; + qsort_r (scratch, n, sizeof (*scratch), compare_opponent, &ctx); + free (scratch); + return ctx.decided_values; +} + +/* Callback function for qsort which counts the number of invocations + in *CLOSURE. */ +static int +compare_counter (const void *l1, const void *r1, void *closure) +{ + const int *l = l1; + const int *r = r1; + unsigned long long int *counter = closure; + ++*counter; + return *l - *r; +} + +/* Count the comparisons required for an adversarial permutation of + length N. */ +static unsigned long long int +count_comparisons (size_t n) +{ + int *array = create_permutation (n); + unsigned long long int counter = 0; + qsort_r (array, n, sizeof (*array), compare_counter, &counter); + free (array); + return counter; +} + +/* Check the scaling factor for one adversarial permutation of length + N, and report some statistics. */ +static void +check_one_n (size_t n) +{ + unsigned long long int count = count_comparisons (n); + double factor = count / (n * log (count)); + printf ("info: length %zu: %llu comparisons ~ %f * n * log (n)\n", + n, count, factor); + /* This is an arbitrary factor which is true for the current + implementation across a wide range of sizes. */ + TEST_VERIFY (factor <= 4.5); +} + +static int +do_test (void) +{ + check_one_n (100); + check_one_n (1000); + for (int i = 1; i <= 15; ++i) + check_one_n (i * 10 * 1000); + return 0; +} + +#include