From patchwork Thu Jul 13 13:25:35 2023
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 72643
To: libc-alpha@sourceware.org
Subject: [PATCH v5 1/6] stdlib: Optimization qsort{_r} swap implementation
Date: Thu, 13 Jul 2023 10:25:35 -0300
Message-Id: <20230713132540.2854320-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20230713132540.2854320-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

The optimization takes into consideration that the most common element
sizes are either 32 or 64 bits and that inputs are usually aligned to a
word boundary.  This is similar to the optimization done in lib/sort.c
from Linux.

This patch adds an optimized swap operation to qsort, based on the
previous msort one.  Instead of a byte-wise swap, three variants are
provided:

  1. Using uint32_t loads and stores.
  2. Using uint64_t loads and stores.
  3. A generic one with a temporary buffer and memcpy/mempcpy.

Options 1. and 2. are selected only if the base pointer and the element
size are aligned to the required type.

Checked on x86_64-linux-gnu.
---
 stdlib/qsort.c | 109 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 91 insertions(+), 18 deletions(-)

diff --git a/stdlib/qsort.c b/stdlib/qsort.c
index 728a0ed370..bf3326e4b9 100644
--- a/stdlib/qsort.c
+++ b/stdlib/qsort.c
@@ -23,20 +23,85 @@
 #include
 #include
 #include
+#include

-/* Byte-wise swap two items of size SIZE. */
-#define SWAP(a, b, size) \
-  do \
-    { \
-      size_t __size = (size); \
-      char *__a = (a), *__b = (b); \
-      do \
-        { \
-          char __tmp = *__a; \
-          *__a++ = *__b; \
-          *__b++ = __tmp; \
-        } while (--__size > 0); \
-    } while (0)
+/* Swap SIZE bytes between addresses A and B.  These helpers are provided
+   along the generic one as an optimization.  */
+
+enum swap_type_t
+  {
+    SWAP_WORDS_64,
+    SWAP_WORDS_32,
+    SWAP_BYTES
+  };
+
+/* Returns true if elements can be copied using word loads and stores.
+   The SIZE and BASE must be a multiple of the ALIGN.  */
+__attribute_const__ __always_inline static bool
+is_aligned (const void *base, size_t size, unsigned char align)
+{
+  return (((uintptr_t) base | size) & (align - 1)) == 0;
+}
+
+static void
+swap_words_64 (void * restrict a, void * restrict b, size_t n)
+{
+  do
+    {
+      n -= 8;
+      uint64_t t = *(uint64_t *)(a + n);
+      *(uint64_t *)(a + n) = *(uint64_t *)(b + n);
+      *(uint64_t *)(b + n) = t;
+    } while (n);
+}
+
+static void
+swap_words_32 (void * restrict a, void * restrict b, size_t n)
+{
+  do
+    {
+      n -= 4;
+      uint32_t t = *(uint32_t *)(a + n);
+      *(uint32_t *)(a + n) = *(uint32_t *)(b + n);
+      *(uint32_t *)(b + n) = t;
+    } while (n);
+}
+
+static void
+swap_bytes (void * restrict a, void * restrict b, size_t n)
+{
+  /* Use multiple small memcpys with constant size to enable inlining
+     on most targets.  */
+  enum { SWAP_GENERIC_SIZE = 32 };
+  unsigned char tmp[SWAP_GENERIC_SIZE];
+  while (n > SWAP_GENERIC_SIZE)
+    {
+      memcpy (tmp, a, SWAP_GENERIC_SIZE);
+      a = __mempcpy (a, b, SWAP_GENERIC_SIZE);
+      b = __mempcpy (b, tmp, SWAP_GENERIC_SIZE);
+      n -= SWAP_GENERIC_SIZE;
+    }
+  while (n > 0)
+    {
+      unsigned char t = ((unsigned char *)a)[--n];
+      ((unsigned char *)a)[n] = ((unsigned char *)b)[n];
+      ((unsigned char *)b)[n] = t;
+    }
+}
+
+/* Replace the indirect call with a serie of if statements.  It should help
+   the branch predictor.  */
+static void
+do_swap (void * restrict a, void * restrict b, size_t size,
+         enum swap_type_t swap_func)
+{
+  if (swap_func == SWAP_WORDS_64)
+    swap_words_64 (a, b, size);
+  else if (swap_func == SWAP_WORDS_32)
+    swap_words_32 (a, b, size);
+  else
+    swap_bytes (a, b, size);
+}

 /* Discontinue quicksort algorithm when partition gets below this size.
    This particular magic number was chosen to work best on a Sun 4/260.  */
@@ -96,6 +161,14 @@ _quicksort (void *const pbase, size_t total_elems, size_t size,
     /* Avoid lossage with unsigned arithmetic below.  */
     return;

+  enum swap_type_t swap_type;
+  if (is_aligned (pbase, size, 8))
+    swap_type = SWAP_WORDS_64;
+  else if (is_aligned (pbase, size, 4))
+    swap_type = SWAP_WORDS_32;
+  else
+    swap_type = SWAP_BYTES;
+
   if (total_elems > MAX_THRESH)
     {
       char *lo = base_ptr;
@@ -119,13 +192,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size,
           char *mid = lo + size * ((hi - lo) / size >> 1);

           if ((*cmp) ((void *) mid, (void *) lo, arg) < 0)
-            SWAP (mid, lo, size);
+            do_swap (mid, lo, size, swap_type);
           if ((*cmp) ((void *) hi, (void *) mid, arg) < 0)
-            SWAP (mid, hi, size);
+            do_swap (mid, hi, size, swap_type);
           else
             goto jump_over;
           if ((*cmp) ((void *) mid, (void *) lo, arg) < 0)
-            SWAP (mid, lo, size);
+            do_swap (mid, lo, size, swap_type);
         jump_over:;

           left_ptr = lo + size;
@@ -144,7 +217,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size,

               if (left_ptr < right_ptr)
                 {
-                  SWAP (left_ptr, right_ptr, size);
+                  do_swap (left_ptr, right_ptr, size, swap_type);
                   if (mid == left_ptr)
                     mid = right_ptr;
                   else if (mid == right_ptr)
@@ -216,7 +289,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size,
        tmp_ptr = run_ptr;

    if (tmp_ptr != base_ptr)
-      SWAP (tmp_ptr, base_ptr, size);
+      do_swap (tmp_ptr, base_ptr, size, swap_type);

    /* Insertion sort, running from left-hand-side up to right-hand-side.  */
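For reference, the selection logic above can be exercised outside of glibc
with a minimal standalone sketch.  is_aligned mirrors the helper the patch
adds; select_swap and the demo driver below are hypothetical names written
only for illustration, and a GCC-style compiler is assumed.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum swap_type_t { SWAP_WORDS_64, SWAP_WORDS_32, SWAP_BYTES };

/* True when both the base pointer and the element size are multiples of
   ALIGN, so whole-word loads and stores stay inside one element and are
   suitably aligned.  */
static bool
is_aligned (const void *base, size_t size, unsigned char align)
{
  return (((uintptr_t) base | size) & (align - 1)) == 0;
}

/* Pick a swap routine the same way the patched _quicksort does: prefer
   64-bit words, then 32-bit words, then the byte-wise fallback.  */
static enum swap_type_t
select_swap (const void *base, size_t size)
{
  if (is_aligned (base, size, 8))
    return SWAP_WORDS_64;
  if (is_aligned (base, size, 4))
    return SWAP_WORDS_32;
  return SWAP_BYTES;
}

int
main (void)
{
  uint64_t words[4];        /* 8-byte elements, normally 8-byte aligned.  */
  unsigned char odd[4][3];  /* 3-byte elements, never word-copyable.  */

  /* Typically prints 0 (SWAP_WORDS_64) and 2 (SWAP_BYTES).  */
  printf ("uint64_t elements -> %d\n",
          (int) select_swap (words, sizeof words[0]));
  printf ("3-byte elements   -> %d\n",
          (int) select_swap (odd, sizeof odd[0]));
  return 0;
}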
From patchwork Thu Jul 13 13:25:36 2023
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 72646
([2804:1b3:a7c0:5656:8d9b:78e7:d879:79cf]) by smtp.gmail.com with ESMTPSA id d195-20020a4a52cc000000b00541fbbbcd31sm2888856oob.5.2023.07.13.06.25.45 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 13 Jul 2023 06:25:45 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v5 2/6] stdlib: Move insertion sort out qsort Date: Thu, 13 Jul 2023 10:25:36 -0300 Message-Id: <20230713132540.2854320-3-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230713132540.2854320-1-adhemerval.zanella@linaro.org> References: <20230713132540.2854320-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" --- stdlib/qsort.c | 101 ++++++++++++++++++++++++++----------------------- 1 file changed, 54 insertions(+), 47 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index bf3326e4b9..410a8d7731 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -149,6 +149,58 @@ typedef struct smaller partition. This *guarantees* no more than log (total_elems) stack size is needed (actually O(1) in this case)! */ +static inline void +insertion_sort_qsort_partitions (void *const pbase, size_t total_elems, + size_t size, enum swap_type_t swap_func, + __compar_d_fn_t cmp, void *arg) +{ + char *base_ptr = (char *) pbase; + char *const end_ptr = &base_ptr[size * (total_elems - 1)]; + char *tmp_ptr = base_ptr; +#define min(x, y) ((x) < (y) ? (x) : (y)) + const size_t max_thresh = MAX_THRESH * size; + char *thresh = min(end_ptr, base_ptr + max_thresh); + char *run_ptr; + + /* Find smallest element in first threshold and place it at the + array's beginning. This is the smallest array element, + and the operation speeds up insertion sort's inner loop. */ + + for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) + if (cmp (run_ptr, tmp_ptr, arg) < 0) + tmp_ptr = run_ptr; + + if (tmp_ptr != base_ptr) + do_swap (tmp_ptr, base_ptr, size, swap_func); + + /* Insertion sort, running from left-hand-side up to right-hand-side. */ + + run_ptr = base_ptr + size; + while ((run_ptr += size) <= end_ptr) + { + tmp_ptr = run_ptr - size; + while (cmp (run_ptr, tmp_ptr, arg) < 0) + tmp_ptr -= size; + + tmp_ptr += size; + if (tmp_ptr != run_ptr) + { + char *trav; + + trav = run_ptr + size; + while (--trav >= run_ptr) + { + char c = *trav; + char *hi, *lo; + + for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) + *hi = *lo; + *hi = c; + } + } + } +} + void _quicksort (void *const pbase, size_t total_elems, size_t size, __compar_d_fn_t cmp, void *arg) @@ -271,51 +323,6 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, for partitions below MAX_THRESH size. BASE_PTR points to the beginning of the array to sort, and END_PTR points at the very last element in the array (*not* one beyond it!). */ - -#define min(x, y) ((x) < (y) ? 
(x) : (y)) - - { - char *const end_ptr = &base_ptr[size * (total_elems - 1)]; - char *tmp_ptr = base_ptr; - char *thresh = min(end_ptr, base_ptr + max_thresh); - char *run_ptr; - - /* Find smallest element in first threshold and place it at the - array's beginning. This is the smallest array element, - and the operation speeds up insertion sort's inner loop. */ - - for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) - if ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr = run_ptr; - - if (tmp_ptr != base_ptr) - do_swap (tmp_ptr, base_ptr, size, swap_type); - - /* Insertion sort, running from left-hand-side up to right-hand-side. */ - - run_ptr = base_ptr + size; - while ((run_ptr += size) <= end_ptr) - { - tmp_ptr = run_ptr - size; - while ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr -= size; - - tmp_ptr += size; - if (tmp_ptr != run_ptr) - { - char *trav; - - trav = run_ptr + size; - while (--trav >= run_ptr) - { - char c = *trav; - char *hi, *lo; - - for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) - *hi = *lo; - *hi = c; - } - } - } - } + insertion_sort_qsort_partitions (pbase, total_elems, size, swap_type, cmp, + arg); } From patchwork Thu Jul 13 13:25:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 72647 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5FA553854E89 for ; Thu, 13 Jul 2023 13:27:42 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5FA553854E89 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1689254862; bh=ikkUZT/PzqP9Wel3HMCh5l5zlqVta5aARi+ES/ylbRQ=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=nzZlccf7HA3fWlk/Qwtv8PzgLmxY/0E2zTIhzthPbcnW6rWCv25mqW0dxV94bR7ZM a4ncqE64SIHnvYBSOhmo8ovwkSQ4RFMy42Gjjb8IyS9AmCwRxpp/1E4ghnrAtK1tWU io5AiQ73dof1Kpuk5bLiSyBRO/RI2ov30zPE8IOo= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oo1-xc2f.google.com (mail-oo1-xc2f.google.com [IPv6:2607:f8b0:4864:20::c2f]) by sourceware.org (Postfix) with ESMTPS id DCAA53858C00 for ; Thu, 13 Jul 2023 13:25:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DCAA53858C00 Received: by mail-oo1-xc2f.google.com with SMTP id 006d021491bc7-56598263d1dso544288eaf.0 for ; Thu, 13 Jul 2023 06:25:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689254747; x=1691846747; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ikkUZT/PzqP9Wel3HMCh5l5zlqVta5aARi+ES/ylbRQ=; b=eSVGRx+dFmYTMFm3wQh4+gBD4H5nICylQgeFN5cMYy9soEHs0rBGxV7PWXadIxwkbl JoMXVRIborKR5AP5JCleFbtLuNs9eg23vNHwE371a/qey74kodZqTTu3yIUh/cdMv1LY yBT+fhvtF5qmM6FrD3GGPANmHCyd0PpiXWwUEvqrshBF6t9FkJd2T/+o6h61Pf43sUea +/+9J5sFYT0O2OzrW+Z+zGgNarHK5AZ4wo9Dn74l+ur/XKIYhGrZK5xCbVDzBmxGqHjm d+WfmYD1D3L4OBRrwNAfsfBcfbUFkTgafxv5NOSmhrYQ7Z05ZGxzhOcDJZ3k1e2zqDwK avxA== X-Gm-Message-State: ABy/qLaAYWTkZcWhgAAscjROprU+695rm6T2c88HFzhGX/0OeHUu+E16 7Gu5xe0JmlUru60eajRZ4zxKMImgghgcW6CUPkfgeg== X-Google-Smtp-Source: 
APBJJlFduUnkNemnyuVEE3jwkmrumPoyy36LzfAGLszR60hW10l+ZwWj1tCjA6NNozvVzIhppvkqew== X-Received: by 2002:a4a:8101:0:b0:566:97a9:b73b with SMTP id b1-20020a4a8101000000b0056697a9b73bmr1187732oog.2.1689254747634; Thu, 13 Jul 2023 06:25:47 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:5656:8d9b:78e7:d879:79cf]) by smtp.gmail.com with ESMTPSA id d195-20020a4a52cc000000b00541fbbbcd31sm2888856oob.5.2023.07.13.06.25.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 13 Jul 2023 06:25:47 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v5 3/6] stdlib: qsort: Move some macros to inline function Date: Thu, 13 Jul 2023 10:25:37 -0300 Message-Id: <20230713132540.2854320-4-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230713132540.2854320-1-adhemerval.zanella@linaro.org> References: <20230713132540.2854320-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" --- stdlib/qsort.c | 35 +++++++++++++++++++++++------------ 1 file changed, 23 insertions(+), 12 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 410a8d7731..53013ef281 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -114,15 +114,28 @@ typedef struct char *hi; } stack_node; -/* The next 4 #defines implement a very fast in-line stack abstraction. */ /* The stack needs log (total_elements) entries (we could even subtract log(MAX_THRESH)). Since total_elements has type size_t, we get as upper bound for log (total_elements): bits per byte (CHAR_BIT) * sizeof(size_t). */ -#define STACK_SIZE (CHAR_BIT * sizeof (size_t)) -#define PUSH(low, high) ((void) ((top->lo = (low)), (top->hi = (high)), ++top)) -#define POP(low, high) ((void) (--top, (low = top->lo), (high = top->hi))) -#define STACK_NOT_EMPTY (stack < top) +enum { STACK_SIZE = CHAR_BIT * sizeof (size_t) }; + +static inline stack_node * +push (stack_node *top, char *lo, char *hi) +{ + top->lo = lo; + top->hi = hi; + return ++top; +} + +static inline stack_node * +pop (stack_node *top, char **lo, char **hi) +{ + --top; + *lo = top->lo; + *hi = top->hi; + return top; +} /* Order size using quicksort. This implementation incorporates @@ -226,11 +239,9 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, char *lo = base_ptr; char *hi = &lo[size * (total_elems - 1)]; stack_node stack[STACK_SIZE]; - stack_node *top = stack; - - PUSH (NULL, NULL); + stack_node *top = stack + 1; - while (STACK_NOT_EMPTY) + while (stack < top) { char *left_ptr; char *right_ptr; @@ -295,7 +306,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, { if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore both small partitions. */ - POP (lo, hi); + top = pop (top, &lo, &hi); else /* Ignore small left partition. 
*/ lo = left_ptr; @@ -306,13 +317,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else if ((right_ptr - lo) > (hi - left_ptr)) { /* Push larger left partition indices. */ - PUSH (lo, right_ptr); + top = push (top, lo, right_ptr); lo = left_ptr; } else { /* Push larger right partition indices. */ - PUSH (left_ptr, hi); + top = push (top, left_ptr, hi); hi = right_ptr; } }
From patchwork Thu Jul 13 13:25:38 2023
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 72645
To: libc-alpha@sourceware.org
Subject: [PATCH v5 4/6] stdlib: Implement introsort with qsort (BZ 19305)
Date: Thu, 13 Jul 2023 10:25:38 -0300
Message-Id: <20230713132540.2854320-5-adhemerval.zanella@linaro.org>
In-Reply-To: <20230713132540.2854320-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

This patch adds an introsort implementation to qsort to avoid the
quadratic worst case of quicksort, bounding it to O(n log n).  The
heapsort fallback is based on the Linux implementation (commit
22a241ccb2c19962a).  As a side note, the introsort implementation is
similar to the one used by libstdc++ for std::sort.

Checked on x86_64-linux-gnu.
--- stdlib/qsort.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 86 insertions(+), 6 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 53013ef281..ed0ac53467 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c
@@ -112,6 +112,7 @@ typedef struct { char *lo; char *hi; + size_t depth; } stack_node; /* The stack needs log (total_elements) entries (we could even subtract
@@ -121,22 +122,90 @@ typedef struct enum { STACK_SIZE = CHAR_BIT * sizeof (size_t) }; static inline stack_node * -push (stack_node *top, char *lo, char *hi) +push (stack_node *top, char *lo, char *hi, size_t depth) { top->lo = lo; top->hi = hi; + top->depth = depth; return ++top; } static inline stack_node * -pop (stack_node *top, char **lo, char **hi) +pop (stack_node *top, char **lo, char **hi, size_t *depth) { --top; *lo = top->lo; *hi = top->hi; + *depth = top->depth; return top; } +/* A fast, small, non-recursive O(nlog n) heapsort, adapted from Linux + lib/sort.c. Used on introsort implementation as a fallback routine with + worst-case performance of O(nlog n) and worst-case space complexity of + O(1). */ + +static inline size_t +parent (size_t i, unsigned int lsbit, size_t size) +{ + i -= size; + i -= size & -(i & lsbit); /* i.e., 'if (i & lsbit) i -= size;' */ + return i / 2; +} + +static void +heapsort_r (void *base, void *end, size_t size, enum swap_type_t swap_type, + __compar_d_fn_t cmp, void *arg) +{ + size_t num = ((uintptr_t) end - (uintptr_t) base) / size; + size_t n = num * size, a = (num/2) * size; + /* Used to find parent */ + const size_t lsbit = size & -size; + + /* num < 2 || size == 0.
*/ + if (a == 0) + return; + + for (;;) + { + size_t b, c, d; + + if (a != 0) + /* Building heap: sift down --a */ + a -= size; + else if (n -= size) + /* Sorting: Extract root to --n */ + do_swap (base, base + n, size, swap_type); + else + break; + + /* Sift element at "a" down into heap. This is the "bottom-up" variant, + which significantly reduces calls to cmp_func(): we find the + sift-down path all the way to the leaves (one compare per level), + then backtrack to find where to insert the target element. + + Because elements tend to sift down close to the leaves, this uses + fewer compares than doing two per level on the way down. (A bit more + than half as many on average, 3/4 worst-case.). */ + for (b = a; c = 2 * b + size, (d = c + size) < n;) + b = cmp (base + c, base + d, arg) >= 0 ? c : d; + if (d == n) + /* Special case last leaf with no sibling. */ + b = c; + + /* Now backtrack from "b" to the correct location for "a". */ + while (b != a && cmp (base + a, base + b, arg) >= 0) + b = parent (b, lsbit, size); + /* Where "a" belongs. */ + c = b; + while (b != a) + { + /* Shift it into place. */ + b = parent (b, lsbit, size); + do_swap (base + b, base + c, size, swap_type); + } + } +} /* Order size using quicksort. This implementation incorporates four optimizations discussed in Sedgewick: @@ -222,7 +291,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, const size_t max_thresh = MAX_THRESH * size; - if (total_elems == 0) + if (total_elems <= 1) /* Avoid lossage with unsigned arithmetic below. */ return; @@ -234,6 +303,10 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else swap_type = SWAP_BYTES; + /* Maximum depth before quicksort switches to heapsort. */ + size_t depth = 2 * (sizeof (size_t) * CHAR_BIT - 1 + - __builtin_clzl (total_elems)); + if (total_elems > MAX_THRESH) { char *lo = base_ptr; @@ -243,6 +316,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, while (stack < top) { + if (depth == 0) + { + heapsort_r (lo, hi, size, swap_type, cmp, arg); + top = pop (top, &lo, &hi, &depth); + continue; + } + char *left_ptr; char *right_ptr; @@ -306,7 +386,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, { if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore both small partitions. */ - top = pop (top, &lo, &hi); + top = pop (top, &lo, &hi, &depth); else /* Ignore small left partition. */ lo = left_ptr; @@ -317,13 +397,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else if ((right_ptr - lo) > (hi - left_ptr)) { /* Push larger left partition indices. */ - top = push (top, lo, right_ptr); + top = push (top, lo, right_ptr, depth - 1); lo = left_ptr; } else { /* Push larger right partition indices. 
*/ - top = push (top, left_ptr, hi); + top = push (top, left_ptr, hi, depth - 1); hi = right_ptr; } }
From patchwork Thu Jul 13 13:25:39 2023
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 72648
To: libc-alpha@sourceware.org
Subject: [PATCH v5 5/6] stdlib: Remove use of mergesort on qsort (BZ 21719)
Date: Thu, 13 Jul 2023 10:25:39 -0300
Message-Id: <20230713132540.2854320-6-adhemerval.zanella@linaro.org>
In-Reply-To: <20230713132540.2854320-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

This patch removes the mergesort optimization from the qsort
implementation and uses the introsort instead.  The mergesort
implementation has several issues:

  - It is AS-safe only for certain type sizes (total size below 1 KB,
    with large element sizes also forcing memory allocation), which
    contradicts the function documentation.  Although not required by
    the C standard, it is preferable and doable to have an O(1) space
    implementation.

  - The malloc call for certain element sizes and counts adds arbitrary
    latency (which might be even worse if malloc is interposed).

  - To avoid triggering swap through its memory allocation, the
    implementation relies on system information that might be
    virtualized (for instance VMs with overcommitted memory), which
    might lead to swap being used even if the system advertises more
    memory than it actually has.  The check also has the downside of
    issuing syscalls where none are expected (although only once per
    execution).

  - The mergesort is suboptimal on an already sorted array (BZ#21719).

The introsort implementation is already optimized to use constant extra
space (due to the limit on the total number of elements implied by the
maximum VM size) and thus avoids the malloc usage issues.  The
resulting performance is slower because quicksort is used, especially
in the worst-case scenarios (partially sorted or already sorted
arrays), and because mergesort uses slightly better swap operations.

This change also renders the BZ#21719 fix unnecessary (it was meant to
fix the sorted-input performance degradation for mergesort).  The
manual is also updated to indicate that the function is now
async-cancel safe.

Checked on x86_64-linux-gnu.
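To make the constant-extra-space claim above concrete, the bound can be
computed in a small standalone sketch.  depth_limit mirrors the
2 * floor (log2 (n)) budget the introsort patch computes before handing a
partition to the heapsort fallback, and STACK_SIZE mirrors the fixed
partition-stack bound in stdlib/qsort.c; the driver itself is made up and a
GCC/Clang compiler is assumed for __builtin_clzl.

#include <limits.h>
#include <stdio.h>

/* Fixed bound on the explicit partition stack, as declared in qsort.c:
   one entry per bit of size_t is always enough.  */
enum { STACK_SIZE = CHAR_BIT * sizeof (size_t) };

/* Depth budget: twice the number of bits needed to represent N.  When it
   reaches zero the partition is sorted by heapsort, which needs no extra
   memory, so the total extra space stays O(1).  */
static size_t
depth_limit (size_t total_elems)
{
  return 2 * (sizeof (size_t) * CHAR_BIT - 1
              - __builtin_clzl (total_elems));
}

int
main (void)
{
  const size_t sizes[] = { 2, 64, 1024, 1000 * 1000, 100 * 1000 * 1000 };

  printf ("explicit stack entries: at most %d\n", (int) STACK_SIZE);
  for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
    printf ("n = %9zu -> depth limit %zu\n", sizes[i],
            depth_limit (sizes[i]));
  return 0;
}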
--- include/stdlib.h | 2 - manual/argp.texi | 2 +- manual/locale.texi | 3 +- manual/search.texi | 7 +- stdlib/Makefile | 2 - stdlib/msort.c | 309 --------------------------------------------- stdlib/qsort.c | 14 +- 7 files changed, 16 insertions(+), 323 deletions(-) delete mode 100644 stdlib/msort.c diff --git a/include/stdlib.h b/include/stdlib.h index 7deb8193d7..7b421f1bb3 100644 --- a/include/stdlib.h +++ b/include/stdlib.h @@ -147,8 +147,6 @@ extern int __posix_openpt (int __oflag) attribute_hidden; extern int __add_to_environ (const char *name, const char *value, const char *combines, int replace) attribute_hidden; -extern void _quicksort (void *const pbase, size_t total_elems, - size_t size, __compar_d_fn_t cmp, void *arg); extern int __on_exit (void (*__func) (int __status, void *__arg), void *__arg); diff --git a/manual/argp.texi b/manual/argp.texi index 0023441812..b77ad68285 100644 --- a/manual/argp.texi +++ b/manual/argp.texi @@ -735,7 +735,7 @@ for options, bad phase of the moon, etc. @c hol_set_group ok @c hol_find_entry ok @c hol_sort @mtslocale @acucorrupt -@c qsort dup @acucorrupt +@c qsort dup @c hol_entry_qcmp @mtslocale @c hol_entry_cmp @mtslocale @c group_cmp ok diff --git a/manual/locale.texi b/manual/locale.texi index 720e0ca952..f6afa5dc44 100644 --- a/manual/locale.texi +++ b/manual/locale.texi @@ -253,7 +253,7 @@ The symbols in this section are defined in the header file @file{locale.h}. @c calculate_head_size ok @c __munmap ok @c compute_hashval ok -@c qsort dup @acucorrupt +@c qsort dup @c rangecmp ok @c malloc @ascuheap @acsmem @c strdup @ascuheap @acsmem @@ -275,7 +275,6 @@ The symbols in this section are defined in the header file @file{locale.h}. @c realloc @ascuheap @acsmem @c realloc @ascuheap @acsmem @c fclose @ascuheap @asulock @acsmem @acsfd @aculock -@c qsort @ascuheap @acsmem @c alias_compare dup @c libc_lock_unlock @aculock @c _nl_explode_name @ascuheap @acsmem diff --git a/manual/search.texi b/manual/search.texi index 5691bf2f2b..a550858478 100644 --- a/manual/search.texi +++ b/manual/search.texi @@ -159,7 +159,7 @@ To sort an array using an arbitrary comparison function, use the @deftypefun void qsort (void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) @standards{ISO, stdlib.h} -@safety{@prelim{}@mtsafe{}@assafe{}@acunsafe{@acucorrupt{}}} +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} The @code{qsort} function sorts the array @var{array}. The array contains @var{count} elements, each of which is of size @var{size}. @@ -199,9 +199,8 @@ Functions}): The @code{qsort} function derives its name from the fact that it was originally implemented using the ``quick sort'' algorithm. -The implementation of @code{qsort} in this library might not be an -in-place sort and might thereby use an extra amount of memory to store -the array. +The implementation of @code{qsort} in this library is an in-place sort +and uses a constant extra space (allocated on the stack). 
@end deftypefun @node Search/Sort Example diff --git a/stdlib/Makefile b/stdlib/Makefile index 25e42a77e7..095518eef4 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -96,7 +96,6 @@ routines := \ mbtowc \ mrand48 \ mrand48_r \ - msort \ nrand48 \ nrand48_r \ old_atexit \ @@ -380,7 +379,6 @@ generated += \ # generated CFLAGS-bsearch.c += $(uses-callbacks) -CFLAGS-msort.c += $(uses-callbacks) CFLAGS-qsort.c += $(uses-callbacks) CFLAGS-system.c += -fexceptions CFLAGS-system.os = -fomit-frame-pointer diff --git a/stdlib/msort.c b/stdlib/msort.c deleted file mode 100644 index bbaa5e9f82..0000000000 --- a/stdlib/msort.c +++ /dev/null @@ -1,309 +0,0 @@ -/* An alternative to qsort, with an identical interface. - This file is part of the GNU C Library. - Copyright (C) 1992-2023 Free Software Foundation, Inc. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include -#include -#include -#include -#include -#include -#include - -struct msort_param -{ - size_t s; - size_t var; - __compar_d_fn_t cmp; - void *arg; - char *t; -}; -static void msort_with_tmp (const struct msort_param *p, void *b, size_t n); - -static void -msort_with_tmp (const struct msort_param *p, void *b, size_t n) -{ - char *b1, *b2; - size_t n1, n2; - - if (n <= 1) - return; - - n1 = n / 2; - n2 = n - n1; - b1 = b; - b2 = (char *) b + (n1 * p->s); - - msort_with_tmp (p, b1, n1); - msort_with_tmp (p, b2, n2); - - char *tmp = p->t; - const size_t s = p->s; - __compar_d_fn_t cmp = p->cmp; - void *arg = p->arg; - switch (p->var) - { - case 0: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - *(uint32_t *) tmp = *(uint32_t *) b1; - b1 += sizeof (uint32_t); - --n1; - } - else - { - *(uint32_t *) tmp = *(uint32_t *) b2; - b2 += sizeof (uint32_t); - --n2; - } - tmp += sizeof (uint32_t); - } - break; - case 1: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - *(uint64_t *) tmp = *(uint64_t *) b1; - b1 += sizeof (uint64_t); - --n1; - } - else - { - *(uint64_t *) tmp = *(uint64_t *) b2; - b2 += sizeof (uint64_t); - --n2; - } - tmp += sizeof (uint64_t); - } - break; - case 2: - while (n1 > 0 && n2 > 0) - { - unsigned long *tmpl = (unsigned long *) tmp; - unsigned long *bl; - - tmp += s; - if ((*cmp) (b1, b2, arg) <= 0) - { - bl = (unsigned long *) b1; - b1 += s; - --n1; - } - else - { - bl = (unsigned long *) b2; - b2 += s; - --n2; - } - while (tmpl < (unsigned long *) tmp) - *tmpl++ = *bl++; - } - break; - case 3: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (*(const void **) b1, *(const void **) b2, arg) <= 0) - { - *(void **) tmp = *(void **) b1; - b1 += sizeof (void *); - --n1; - } - else - { - *(void **) tmp = *(void **) b2; - b2 += sizeof (void *); - --n2; - } - tmp += sizeof (void *); - } - break; - default: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - tmp = (char *) __mempcpy (tmp, b1, s); - b1 += s; - --n1; - } - else - { - tmp = (char *) __mempcpy (tmp, b2, s); - b2 
+= s; - --n2; - } - } - break; - } - - if (n1 > 0) - memcpy (tmp, b1, n1 * s); - memcpy (b, p->t, (n - n2) * s); -} - - -void -__qsort_r (void *b, size_t n, size_t s, __compar_d_fn_t cmp, void *arg) -{ - size_t size = n * s; - char *tmp = NULL; - struct msort_param p; - - /* For large object sizes use indirect sorting. */ - if (s > 32) - size = 2 * n * sizeof (void *) + s; - - if (size < 1024) - /* The temporary array is small, so put it on the stack. */ - p.t = __alloca (size); - else - { - /* We should avoid allocating too much memory since this might - have to be backed up by swap space. */ - static long int phys_pages; - static int pagesize; - - if (pagesize == 0) - { - phys_pages = __sysconf (_SC_PHYS_PAGES); - - if (phys_pages == -1) - /* Error while determining the memory size. So let's - assume there is enough memory. Otherwise the - implementer should provide a complete implementation of - the `sysconf' function. */ - phys_pages = (long int) (~0ul >> 1); - - /* The following determines that we will never use more than - a quarter of the physical memory. */ - phys_pages /= 4; - - /* Make sure phys_pages is written to memory. */ - atomic_write_barrier (); - - pagesize = __sysconf (_SC_PAGESIZE); - } - - /* Just a comment here. We cannot compute - phys_pages * pagesize - and compare the needed amount of memory against this value. - The problem is that some systems might have more physical - memory then can be represented with a `size_t' value (when - measured in bytes. */ - - /* If the memory requirements are too high don't allocate memory. */ - if (size / pagesize > (size_t) phys_pages) - { - _quicksort (b, n, s, cmp, arg); - return; - } - - /* It's somewhat large, so malloc it. */ - int save = errno; - tmp = malloc (size); - __set_errno (save); - if (tmp == NULL) - { - /* Couldn't get space, so use the slower algorithm - that doesn't need a temporary array. */ - _quicksort (b, n, s, cmp, arg); - return; - } - p.t = tmp; - } - - p.s = s; - p.var = 4; - p.cmp = cmp; - p.arg = arg; - - if (s > 32) - { - /* Indirect sorting. */ - char *ip = (char *) b; - void **tp = (void **) (p.t + n * sizeof (void *)); - void **t = tp; - void *tmp_storage = (void *) (tp + n); - - while ((void *) t < tmp_storage) - { - *t++ = ip; - ip += s; - } - p.s = sizeof (void *); - p.var = 3; - msort_with_tmp (&p, p.t + n * sizeof (void *), n); - - /* tp[0] .. tp[n - 1] is now sorted, copy around entries of - the original array. Knuth vol. 3 (2nd ed.) exercise 5.2-10. 
*/ - char *kp; - size_t i; - for (i = 0, ip = (char *) b; i < n; i++, ip += s) - if ((kp = tp[i]) != ip) - { - size_t j = i; - char *jp = ip; - memcpy (tmp_storage, ip, s); - - do - { - size_t k = (kp - (char *) b) / s; - tp[j] = jp; - memcpy (jp, kp, s); - j = k; - jp = kp; - kp = tp[k]; - } - while (kp != ip); - - tp[j] = jp; - memcpy (jp, tmp_storage, s); - } - } - else - { - if ((s & (sizeof (uint32_t) - 1)) == 0 - && ((uintptr_t) b) % __alignof__ (uint32_t) == 0) - { - if (s == sizeof (uint32_t)) - p.var = 0; - else if (s == sizeof (uint64_t) - && ((uintptr_t) b) % __alignof__ (uint64_t) == 0) - p.var = 1; - else if ((s & (sizeof (unsigned long) - 1)) == 0 - && ((uintptr_t) b) - % __alignof__ (unsigned long) == 0) - p.var = 2; - } - msort_with_tmp (&p, b, n); - } - free (tmp); -} -libc_hidden_def (__qsort_r) -weak_alias (__qsort_r, qsort_r) - - -void -qsort (void *b, size_t n, size_t s, __compar_fn_t cmp) -{ - return __qsort_r (b, n, s, (__compar_d_fn_t) cmp, NULL); -} -libc_hidden_def (qsort) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index ed0ac53467..0bcbb6a14a 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -19,7 +19,6 @@ Engineering a sort function; Jon Bentley and M. Douglas McIlroy; Software - Practice and Experience; Vol. 23 (11), 1249-1265, 1993. */ -#include #include #include #include @@ -284,8 +283,8 @@ insertion_sort_qsort_partitions (void *const pbase, size_t total_elems, } void -_quicksort (void *const pbase, size_t total_elems, size_t size, - __compar_d_fn_t cmp, void *arg) +__qsort_r (void *const pbase, size_t total_elems, size_t size, + __compar_d_fn_t cmp, void *arg) { char *base_ptr = (char *) pbase; @@ -417,3 +416,12 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, insertion_sort_qsort_partitions (pbase, total_elems, size, swap_type, cmp, arg); } +libc_hidden_def (__qsort_r) +weak_alias (__qsort_r, qsort_r) + +void +qsort (void *b, size_t n, size_t s, __compar_fn_t cmp) +{ + return __qsort_r (b, n, s, (__compar_d_fn_t) cmp, NULL); +} +libc_hidden_def (qsort) From patchwork Thu Jul 13 13:25:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 72644 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1606E385AF9E for ; Thu, 13 Jul 2023 13:26:36 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1606E385AF9E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1689254796; bh=CIQPanB2+NjpW4wZ2B7qmNAdy4js4l2VHeg4oX1kc8Q=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=IRchxtt77ASYdQrcOf7C2eqYCGxWpaWbfGzAKcqDZ2PKTAS35BPWkEQp02Lv8rBHq 83GVvSMRHqTp/YmtDmJy/amVmP3INcQingGtoljrftx/RX6wO5LTd1GOS6SkqnX7fP KDHlr1mbuPJnDMZgx6oSci21XI+5Jkq2FwSei/6U= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x34.google.com (mail-oa1-x34.google.com [IPv6:2001:4860:4864:20::34]) by sourceware.org (Postfix) with ESMTPS id 442E8385770D for ; Thu, 13 Jul 2023 13:25:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 442E8385770D Received: by mail-oa1-x34.google.com with SMTP id 586e51a60fabf-1b059dd7c0cso622863fac.0 for ; Thu, 13 Jul 2023 06:25:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
To: libc-alpha@sourceware.org
Subject: [PATCH v5 6/6] stdlib: Add more qsort{_r} coverage
Date: Thu, 13 Jul 2023 10:25:40 -0300
Message-Id: <20230713132540.2854320-7-adhemerval.zanella@linaro.org>
In-Reply-To: <20230713132540.2854320-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

This patch adds tests for qsort and qsort_r (for which glibc currently
lacks coverage).  The tests are run with random input, different
internal types (uint8_t, uint16_t, uint32_t, and uint64_t), and
different numbers of elements (from 0 to 262144).

Checked on x86_64-linux-gnu.
--- stdlib/Makefile | 1 + stdlib/tst-qsort3.c | 298 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 299 insertions(+) create mode 100644 stdlib/tst-qsort3.c diff --git a/stdlib/Makefile b/stdlib/Makefile index 095518eef4..6af606136e 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -214,6 +214,7 @@ tests := \ tst-on_exit \ tst-qsort \ tst-qsort2 \ + tst-qsort3 \ tst-quick_exit \ tst-rand48 \ tst-rand48-2 \ diff --git a/stdlib/tst-qsort3.c b/stdlib/tst-qsort3.c new file mode 100644 index 0000000000..6940540289 --- /dev/null +++ b/stdlib/tst-qsort3.c @@ -0,0 +1,298 @@ +/* qsort(_r) tests to trigger worst case for quicksort. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library.
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +typedef enum +{ + Sorted, + Random, + Repeated, + Bitonic +} arraytype_t; + +/* Ratio of total of elements which will be repeated. */ +static const double RepeatedRatio = 0.2; + +struct array_t +{ + arraytype_t type; + const char *name; +} static const arraytypes[] = +{ + { Sorted, "Sorted" }, + { Random, "Random" }, + { Repeated, "Repeated" }, + { Bitonic, "Bitonic" }, +}; + +/* Return the index of BASE as interpreted as an array of elements + of size SIZE. */ +static inline void * +arr (void *base, size_t idx, size_t size) +{ + return (void*)((uintptr_t)base + (idx * size)); +} + +/* Functions used to check qsort. */ +static int +uint8_t_cmp (const void *a, const void *b) +{ + uint8_t ia = *(uint8_t*)a; + uint8_t ib = *(uint8_t*)b; + return (ia > ib) - (ia < ib); +} + +static int +uint16_t_cmp (const void *a, const void *b) +{ + uint16_t ia = *(uint16_t*)a; + uint16_t ib = *(uint16_t*)b; + return (ia > ib) - (ia < ib); +} + +static int +uint32_t_cmp (const void *a, const void *b) +{ + uint32_t ia = *(uint32_t*)a; + uint32_t ib = *(uint32_t*)b; + return (ia > ib) - (ia < ib); +} + +static int +uint64_t_cmp (const void *a, const void *b) +{ + uint64_t ia = *(uint64_t*)a; + uint64_t ib = *(uint64_t*)b; + return (ia > ib) - (ia < ib); +} + +#define LARGE_SIZE 48 + +static int +large_cmp (const void *a, const void *b) +{ + return memcmp (a, b, LARGE_SIZE); +} + +/* Function used to check qsort_r. 
*/ +typedef enum +{ + UINT8_CMP_T, + UINT16_CMP_T, + UINT32_CMP_T, + UINT64_CMP_T, + LARGE_CMP_T +} type_cmp_t; + +static type_cmp_t +uint_t_cmp_type (size_t sz) +{ + switch (sz) + { + case sizeof (uint8_t): return UINT8_CMP_T; + case sizeof (uint16_t): return UINT16_CMP_T; + case sizeof (uint64_t): return UINT64_CMP_T; + case sizeof (uint32_t): return UINT32_CMP_T; + default: return LARGE_CMP_T; + } +} + +static int +uint_t_cmp (const void *a, const void *b, void *arg) +{ + type_cmp_t type = *(type_cmp_t*) arg; + switch (type) + { + case UINT8_CMP_T: return uint8_t_cmp (a, b); + case UINT32_CMP_T: return uint32_t_cmp (a, b); + case UINT16_CMP_T: return uint16_t_cmp (a, b); + case UINT64_CMP_T: return uint64_t_cmp (a, b); + default: return large_cmp (a, b); + } +} + +static void +seq (void *elem, size_t type_size, int value) +{ + if (type_size == sizeof (uint8_t)) + *(uint8_t*)elem = value; + else if (type_size == sizeof (uint16_t)) + *(uint16_t*)elem = value; + else if (type_size == sizeof (uint32_t)) + *(uint32_t*)elem = value; + else if (type_size == sizeof (uint64_t)) + *(uint64_t*)elem = value; + else + memset (elem, value, type_size); +} + +static void * +create_array (size_t nmemb, size_t type_size, arraytype_t type) +{ + size_t size = nmemb * type_size; + void *array = xmalloc (size); + + switch (type) + { + case Sorted: + for (size_t i = 0; i < nmemb; i++) + seq (arr (array, i, type_size), type_size, i); + break; + + case Random: + arc4random_buf (array, size); + break; + + case Repeated: + { + arc4random_buf (array, size); + + void *randelem = xmalloc (type_size); + arc4random_buf (randelem, type_size); + + /* Repeat REPEATED elements (based on RepeatRatio ratio) in the random + array. */ + size_t repeated = (size_t)(nmemb * RepeatedRatio); + for (size_t i = 0; i < repeated; i++) + { + size_t pos = arc4random_uniform (nmemb - 1); + memcpy (arr (array, pos, type_size), randelem, type_size); + } + free (randelem); + } + break; + + case Bitonic: + { + size_t i; + for (i = 0; i < nmemb / 2; i++) + seq (arr (array, i, type_size), type_size, i); + for ( ; i < nmemb; i++) + seq (arr (array, i, type_size), type_size, (nmemb - 1) - i); + } + break; + } + + return array; +} + +typedef int (*cmpfunc_t)(const void *, const void *); + +/* Check if ARRAY of total NMEMB element of size SIZE is sorted + based on CMPFUNC. */ +static void +check_array (void *array, size_t nmemb, size_t type_size, + cmpfunc_t cmpfunc) +{ + for (size_t i = 1; i < nmemb; i++) + { + int ret = cmpfunc (arr (array, i, type_size), + arr (array, i-1, type_size)); + TEST_VERIFY_EXIT (ret >= 0); + } +} + +static void +check_qsort (size_t nelem, size_t type_size, arraytype_t type, + cmpfunc_t cmpfunc) +{ + void *array = create_array (nelem, type_size, type); + + qsort (array, nelem, type_size, cmpfunc); + + check_array (array, nelem, type_size, cmpfunc); + + free (array); +} + +static void +check_qsort_r (size_t nelem, size_t type_size, arraytype_t type, + cmpfunc_t cmpfunc) +{ + void *array = create_array (nelem, type_size, type); + + type_cmp_t typecmp = uint_t_cmp_type (type_size); + qsort_r (array, nelem, type_size, uint_t_cmp, &typecmp); + + check_array (array, nelem, type_size, cmpfunc); + + free (array); +} + +static int +do_test (void) +{ + /* Some random sizes. 
*/ + const size_t nelems[] = { 0, 1, 7, 20, 32, 100, 256, 1024, 8560, 17384 }; + + struct test_t + { + size_t type_size; + cmpfunc_t cmpfunc; + } + const tests[] = + { + { sizeof (uint8_t), uint8_t_cmp }, + { sizeof (uint16_t), uint16_t_cmp }, + { sizeof (uint32_t), uint32_t_cmp }, + { sizeof (uint64_t), uint64_t_cmp }, + /* Test swap with large elements. */ + { LARGE_SIZE, large_cmp }, + }; + + for (const struct test_t *test = tests; test < array_end (tests); ++test) + { + if (test_verbose > 0) + printf ("info: testing qsort with type_size=%zu\n", test->type_size); + for (const struct array_t *arraytype = arraytypes; + arraytype < array_end (arraytypes); + ++arraytype) + { + if (test_verbose > 0) + printf (" distribution=%s\n", arraytype->name); + for (const size_t *nelem = nelems; + nelem < array_end (nelems); + ++nelem) + { + if (test_verbose > 0) + printf (" i nelem=%zu, total size=%zu\n", *nelem, + *nelem * test->type_size); + + check_qsort (*nelem, test->type_size, arraytype->type, + test->cmpfunc); + check_qsort_r (*nelem, test->type_size, arraytype->type, + test->cmpfunc); + } + } + } + + return 0; +} + +#include
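As a closing usage note (not part of the patch series): the context
argument that tst-qsort3.c passes through qsort_r is the same mechanism an
application would use, as in the sketch below.  The struct, field names, and
driver are made up for illustration; _GNU_SOURCE is required for the GNU
qsort_r prototype used here.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

struct item
{
  int key;
  int payload;
};

/* Context-aware comparison: *ARG is +1 for ascending, -1 for descending.  */
static int
cmp_item (const void *a, const void *b, void *arg)
{
  int dir = *(const int *) arg;
  const struct item *ia = a;
  const struct item *ib = b;
  return dir * ((ia->key > ib->key) - (ia->key < ib->key));
}

int
main (void)
{
  struct item v[] = { { 3, 30 }, { 1, 10 }, { 2, 20 }, { 2, 21 } };
  const size_t n = sizeof v / sizeof v[0];
  int descending = -1;

  qsort_r (v, n, sizeof v[0], cmp_item, &descending);

  for (size_t i = 0; i < n; i++)
    printf ("%d ", v[i].key);      /* Prints: 3 2 2 1  */
  putchar ('\n');
  return 0;
}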