From patchwork Thu Sep 14 18:36:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 75998 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 32FE6385DC15 for ; Thu, 14 Sep 2023 18:37:27 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oo1-xc34.google.com (mail-oo1-xc34.google.com [IPv6:2607:f8b0:4864:20::c34]) by sourceware.org (Postfix) with ESMTPS id D45043858C52 for ; Thu, 14 Sep 2023 18:37:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D45043858C52 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oo1-xc34.google.com with SMTP id 006d021491bc7-5739972accdso1281158eaf.1 for ; Thu, 14 Sep 2023 11:37:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1694716631; x=1695321431; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=3gxuH9ByZCz9s0Dh7UOISa4Ep/Exr8sPymluuI3E+qI=; b=BG3I9AD2KgcgTwVa2v/Vp5OiXlpxYwTGfALovC+lfWoJZ08G3STG/SHpJJoqWmQ4gu 4c3miGiDSmyYPqfAq/3Fo9xYX5cG5zOP+NqYgUV+3XdUwln0emORBjkfgMLFuM/MV1G4 xwZK2qdN7e95DK3I2pSiKUiy0GFZlcKY01TDXuYDVjciGiCY+mr7NTR1CN70M3Q2cjF6 bjGv4qb3SkBevwU7nmsm+IBafjZKEkQjMZ2BDI8/cVoVtt0yK7b56MbtH3ZNd3PA1nuP 4xAm7UkhM3aEROr/bDrEwGKgcTwmJAQfcHte2cv+JhocZ0DlS6GuT3QypH7dqm4Hu/0N JGMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694716631; x=1695321431; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3gxuH9ByZCz9s0Dh7UOISa4Ep/Exr8sPymluuI3E+qI=; b=PJdEeOgMZOzOqTl1hzd4HxyFSxgZJeo95/eoR7fXX0kSiHirIUVoaGgjizmytBaQRS v+lAJSzsZTKAPWNOrWMQnsUd/6yETscpjHfL/KLs7IXArrYVbyaljRG+RU1S10H9aw0G VOM8NiiF4m8MRtFIUyJUPQs4oL1RUnpVT/zpWFDgv+qPF4npH0xyVF3jZUhT0qHV7Jwn ZLoYAkVIrOQkRVIK6run73U5eaINj/5tHP7mmUSQdxBvadrrfxvELBWBjT+s3PlrnReX 3QUWv7dYqnIZpsuyPxhnKLh1Dty1HS6xw4ZbQcmRb1ff9DqYkGfwhFvjre0jRbtdy18G 06Pg== X-Gm-Message-State: AOJu0YwMX/UYyM2sAbjJ4PE6dKqmGDUwcuvUM62p9k/vuHtUuV6aRzqs /kOvKM8JQ22b9bKYlJIxpcdBz/8C2WQ4tUza7zjdJQ== X-Google-Smtp-Source: AGHT+IEe+oRpFfpkTZDw+PonRkK6uVvnsHleqGIQ5LoGSuScbNk8cQetR5SjlUmA11t/URpuLfM9hQ== X-Received: by 2002:a05:6870:969f:b0:1bb:48c5:9d56 with SMTP id o31-20020a056870969f00b001bb48c59d56mr1951755oaq.18.1694716631546; Thu, 14 Sep 2023 11:37:11 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:91cb:5173:d70:d8d:4d2c]) by smtp.gmail.com with ESMTPSA id cc6-20020a056871e18600b001b3d67934e9sm1056930oac.26.2023.09.14.11.37.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 11:37:10 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Noah Goldstein , Paul Eggert , Florian Weimer Subject: [PATCH v7 1/7] string: Add internal memswap implementation Date: Thu, 14 Sep 2023 15:36:58 -0300 Message-Id: <20230914183704.1386534-2-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> References: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org The prototype is: void __memswap (void *restrict p1, void *restrict p2, size_t n) The function swaps the content of two memory blocks P1 and P2 of len N. Memory overlap is NOT handled. It will be used on qsort optimization. Checked on x86_64-linux-gnu and aarch64-linux-gnu. --- include/string.h | 3 + string/Makefile | 13 +++ string/memswap.c | 41 +++++++++ string/test-memswap.c | 191 ++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 248 insertions(+) create mode 100644 string/memswap.c create mode 100644 string/test-memswap.c diff --git a/include/string.h b/include/string.h index 86d1fa4abe..f31a7f6006 100644 --- a/include/string.h +++ b/include/string.h @@ -45,6 +45,9 @@ extern void *__memrchr (const void *__s, int __c, size_t __n) extern void *__memchr (const void *__s, int __c, size_t __n) __attribute_pure__; +extern void __memswap (void *restrict __p1, void *restrict __p2, size_t __n) + attribute_hidden; + extern void __bzero (void *__s, size_t __n) __THROW __nonnull ((1)); extern int __ffs (int __i) __attribute__ ((const)); diff --git a/string/Makefile b/string/Makefile index 8cdfd5b000..3d0a3d5682 100644 --- a/string/Makefile +++ b/string/Makefile @@ -69,6 +69,7 @@ routines := \ mempcpy \ memrchr \ memset \ + memswap \ rawmemchr \ sigabbrev_np \ sigdescr_np \ @@ -209,6 +210,18 @@ tests := \ tst-xbzero-opt \ # tests +tests-static-internal := \ + test-memswap \ +# tests-static-internal + +tests-internal := \ + $(tests-static-internal) \ + # tests-internal + +tests-static := \ + $(tests-static-internal) \ + # tests-static + # Both tests require the .mo translation files generated by msgfmt. tests-translation := \ tst-strerror \ diff --git a/string/memswap.c b/string/memswap.c new file mode 100644 index 0000000000..37640c47ad --- /dev/null +++ b/string/memswap.c @@ -0,0 +1,41 @@ +/* Swap the content of two memory blocks, overlap is NOT handled. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include + +void +__memswap (void *restrict p1, void *restrict p2, size_t n) +{ + /* Use multiple small memcpys with constant size to enable inlining on most + targets. */ + enum { SWAP_GENERIC_SIZE = 32 }; + unsigned char tmp[SWAP_GENERIC_SIZE]; + while (n > SWAP_GENERIC_SIZE) + { + memcpy (tmp, p1, SWAP_GENERIC_SIZE); + p1 = __mempcpy (p1, p2, SWAP_GENERIC_SIZE); + p2 = __mempcpy (p2, tmp, SWAP_GENERIC_SIZE); + n -= SWAP_GENERIC_SIZE; + } + while (n > 0) + { + unsigned char t = ((unsigned char *)p1)[--n]; + ((unsigned char *)p1)[n] = ((unsigned char *)p2)[n]; + ((unsigned char *)p2)[n] = t; + } +} diff --git a/string/test-memswap.c b/string/test-memswap.c new file mode 100644 index 0000000000..7696c43711 --- /dev/null +++ b/string/test-memswap.c @@ -0,0 +1,191 @@ +/* Test and measure memcpy functions. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + +#define TEST_MAIN +#define BUF1PAGES 3 +#include "test-string.h" + +static unsigned char *ref1; +static unsigned char *ref2; + +static void +do_one_test (unsigned char *p1, unsigned char *ref1, unsigned char *p2, + unsigned char *ref2, size_t len) +{ + __memswap (p1, p2, len); + + TEST_COMPARE_BLOB (p1, len, ref2, len); + TEST_COMPARE_BLOB (p2, len, ref1, len); +} + +static inline void +do_test (size_t align1, size_t align2, size_t len) +{ + align1 &= page_size; + if (align1 + len >= page_size) + return; + + align2 &= page_size; + if (align2 + len >= page_size) + return; + + unsigned char *p1 = buf1 + align1; + unsigned char *p2 = buf2 + align2; + for (size_t repeats = 0; repeats < 2; ++repeats) + { + size_t i, j; + for (i = 0, j = 1; i < len; i++, j += 23) + { + ref1[i] = p1[i] = j; + ref2[i] = p2[i] = UCHAR_MAX - j; + } + + do_one_test (p1, ref1, p2, ref2, len); + } +} + +static void +do_random_tests (void) +{ + for (size_t n = 0; n < ITERATIONS; n++) + { + size_t len, size, size1, size2, align1, align2; + + if (n == 0) + { + len = getpagesize (); + size = len + 512; + size1 = size; + size2 = size; + align1 = 512; + align2 = 512; + } + else + { + if ((random () & 255) == 0) + size = 65536; + else + size = 768; + if (size > page_size) + size = page_size; + size1 = size; + size2 = size; + size_t i = random (); + if (i & 3) + size -= 256; + if (i & 1) + size1 -= 256; + if (i & 2) + size2 -= 256; + if (i & 4) + { + len = random () % size; + align1 = size1 - len - (random () & 31); + align2 = size2 - len - (random () & 31); + if (align1 > size1) + align1 = 0; + if (align2 > size2) + align2 = 0; + } + else + { + align1 = random () & 63; + align2 = random () & 63; + len = random () % size; + if (align1 + len > size1) + align1 = size1 - len; + if (align2 + len > size2) + align2 = size2 - len; + } + } + unsigned char *p1 = buf1 + page_size - size1; + unsigned char *p2 = buf2 + page_size - size2; + size_t j = align1 + len + 256; + if (j > size1) + j = size1; + for (size_t i = 0; i < j; ++i) + ref1[i] = p1[i] = random () & 255; + + j = align2 + len + 256; + if (j > size2) + j = size2; + + for (size_t i = 0; i < j; ++i) + ref2[i] = p2[i] = random () & 255; + + do_one_test (p1 + align1, ref1 + align1, p2 + align2, ref2 + align2, len); + } +} + +static int +test_main (void) +{ + test_init (); + /* Use the start of buf1 for reference buffers. */ + ref1 = buf1; + ref2 = buf1 + page_size; + buf1 = ref2 + page_size; + + printf ("%23s", ""); + printf ("\t__memswap\n"); + + for (size_t i = 0; i < 18; ++i) + { + do_test (0, 0, 1 << i); + do_test (i, 0, 1 << i); + do_test (0, i, 1 << i); + do_test (i, i, 1 << i); + } + + for (size_t i = 0; i < 32; ++i) + { + do_test (0, 0, i); + do_test (i, 0, i); + do_test (0, i, i); + do_test (i, i, i); + } + + for (size_t i = 3; i < 32; ++i) + { + if ((i & (i - 1)) == 0) + continue; + do_test (0, 0, 16 * i); + do_test (i, 0, 16 * i); + do_test (0, i, 16 * i); + do_test (i, i, 16 * i); + } + + for (size_t i = 19; i <= 25; ++i) + { + do_test (255, 0, 1 << i); + do_test (0, 4000, 1 << i); + do_test (0, 255, i); + do_test (0, 4000, i); + } + + do_test (0, 0, getpagesize ()); + + do_random_tests (); + + return 0; +} + +#include From patchwork Thu Sep 14 18:36:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 75999 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3A26E3856090 for ; Thu, 14 Sep 2023 18:37:29 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x30.google.com (mail-oa1-x30.google.com [IPv6:2001:4860:4864:20::30]) by sourceware.org (Postfix) with ESMTPS id 4AD1C385771E for ; Thu, 14 Sep 2023 18:37:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4AD1C385771E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oa1-x30.google.com with SMTP id 586e51a60fabf-1d640e49e0aso358817fac.2 for ; Thu, 14 Sep 2023 11:37:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1694716634; x=1695321434; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=I/mdUPzISC+XmuJBFaDw8dL6gLvuN2O/2+RJn1xOxGA=; b=WLbuDEUwwsUgT03m8URvmSvVZJVmVc4nBcume2nCsXLU4GfCGyI8jzCSiScacHrRUH AKFY9RelqRL+C27wkuzwKUTXTODgQ4JAhspsKK177hsLbtXL+j3oh6swouA2zjJJNzGL iAFvxHXiXRYFxG104fR55TGXtilMMGgnm2xx5BSuf+C7TrhYiBczskq+49n3QZOAEa8G A1rknCvIZRxKfQ1okUe5s1sWfkpf2IxMHaq+qns7nCICOOizRVAefoP2wR3dE1tX5QvK o/mKuTaNbRs5Zn0oxTFtCE3THNnXbg+3lNxEqa4qularCZR92wZa/vG7PhaOgEydw1A4 rnYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694716634; x=1695321434; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=I/mdUPzISC+XmuJBFaDw8dL6gLvuN2O/2+RJn1xOxGA=; b=XekYqAETkeioXixBm49SHg9p/kDVY7PF1D8k8HGvlzAF8O1bKcTsOoh6THu2tAhqAD IMUi/f87o9Ebrqe+VW9NTmGw8rH2HRFkCegR5tZq9iWcUZvHy7WsWeo3Tnvs7wKuzuMp /zQ0DSy/eT9QZoCnKFTYif6w6+qK3aRxsvnOodbxW/LKiOUFHsUM5jD1eWW8mZpsuGgs tpfrAAMS+CCqVTSRO5v57yVI0n67dFQcgimabN2zZMPfDKqeqkAiC7irEL7tFkhyGXhx qflkOVR90NuVP85ndgJbHgIxEXm8CVahIQp6eXQrutZlDd790EijixzOwj56Z0jUwewm JRwA== X-Gm-Message-State: AOJu0YxW16V6ta/HIK7EwNASRMAdARZ21wc6s98pS4CEPT33bB4zgp7/ CtLAQcgkapyjOq3PAJo6mRwGeY8beuiI4CMH4FP3VA== X-Google-Smtp-Source: AGHT+IHlSCyb2TFnTY/OJqVs85PAC7RemQI98pzhSD6Bx88u6ycrheTncj0+sqZtlsKWo3zk1VlIZA== X-Received: by 2002:a05:6871:797:b0:1d5:8f05:39da with SMTP id o23-20020a056871079700b001d58f0539damr6317636oap.25.1694716633988; Thu, 14 Sep 2023 11:37:13 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:91cb:5173:d70:d8d:4d2c]) by smtp.gmail.com with ESMTPSA id cc6-20020a056871e18600b001b3d67934e9sm1056930oac.26.2023.09.14.11.37.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 11:37:13 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Noah Goldstein , Paul Eggert , Florian Weimer Subject: [PATCH v7 2/7] stdlib: Optimization qsort{_r} swap implementation Date: Thu, 14 Sep 2023 15:36:59 -0300 Message-Id: <20230914183704.1386534-3-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> References: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org The optimization takes in consideration both the most common elements are either 32 or 64 bit in size and inputs are aligned to the word boundary. This is similar to what msort does. For large buffer the swap operation uses memcpy/mempcpy with a small fixed size buffer (so compiler might inline the operations). Checked on x86_64-linux-gnu. --- stdlib/qsort.c | 94 ++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 76 insertions(+), 18 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 728a0ed370..bba9783191 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -23,20 +23,70 @@ #include #include #include +#include -/* Byte-wise swap two items of size SIZE. */ -#define SWAP(a, b, size) \ - do \ - { \ - size_t __size = (size); \ - char *__a = (a), *__b = (b); \ - do \ - { \ - char __tmp = *__a; \ - *__a++ = *__b; \ - *__b++ = __tmp; \ - } while (--__size > 0); \ - } while (0) +/* Swap SIZE bytes between addresses A and B. These helpers are provided + along the generic one as an optimization. */ + +enum swap_type_t + { + SWAP_WORDS_64, + SWAP_WORDS_32, + SWAP_BYTES + }; + +/* If this function returns true, elements can be safely copied using word + loads and stores. Otherwise, it might not be safe. BASE (as an integer) + must be a multiple of the word alignment. SIZE must be a multiple of + WORDSIZE. Since WORDSIZE must be a multiple of the word alignment, and + WORDSIZE is a power of two on all supported platforms, this function for + speed merely checks that BASE and SIZE are both multiples of the word + size. */ +static inline bool +is_aligned (const void *base, size_t size, size_t wordsize) +{ + return (((uintptr_t) base | size) & (wordsize - 1)) == 0; +} + +static inline void +swap_words_64 (void * restrict a, void * restrict b, size_t n) +{ + typedef uint64_t __attribute__ ((__may_alias__)) u64_alias_t; + do + { + n -= 8; + u64_alias_t t = *(u64_alias_t *)(a + n); + *(u64_alias_t *)(a + n) = *(u64_alias_t *)(b + n); + *(u64_alias_t *)(b + n) = t; + } while (n); +} + +static inline void +swap_words_32 (void * restrict a, void * restrict b, size_t n) +{ + typedef uint32_t __attribute__ ((__may_alias__)) u32_alias_t; + do + { + n -= 4; + u32_alias_t t = *(u32_alias_t *)(a + n); + *(u32_alias_t *)(a + n) = *(u32_alias_t *)(b + n); + *(u32_alias_t *)(b + n) = t; + } while (n); +} + +/* Replace the indirect call with a serie of if statements. It should help + the branch predictor. */ +static void +do_swap (void * restrict a, void * restrict b, size_t size, + enum swap_type_t swap_type) +{ + if (swap_type == SWAP_WORDS_64) + swap_words_64 (a, b, size); + else if (swap_type == SWAP_WORDS_32) + swap_words_32 (a, b, size); + else + __memswap (a, b, size); +} /* Discontinue quicksort algorithm when partition gets below this size. This particular magic number was chosen to work best on a Sun 4/260. */ @@ -96,6 +146,14 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, /* Avoid lossage with unsigned arithmetic below. */ return; + enum swap_type_t swap_type; + if (is_aligned (pbase, size, 8)) + swap_type = SWAP_WORDS_64; + else if (is_aligned (pbase, size, 4)) + swap_type = SWAP_WORDS_32; + else + swap_type = SWAP_BYTES; + if (total_elems > MAX_THRESH) { char *lo = base_ptr; @@ -119,13 +177,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, char *mid = lo + size * ((hi - lo) / size >> 1); if ((*cmp) ((void *) mid, (void *) lo, arg) < 0) - SWAP (mid, lo, size); + do_swap (mid, lo, size, swap_type); if ((*cmp) ((void *) hi, (void *) mid, arg) < 0) - SWAP (mid, hi, size); + do_swap (mid, hi, size, swap_type); else goto jump_over; if ((*cmp) ((void *) mid, (void *) lo, arg) < 0) - SWAP (mid, lo, size); + do_swap (mid, lo, size, swap_type); jump_over:; left_ptr = lo + size; @@ -144,7 +202,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, if (left_ptr < right_ptr) { - SWAP (left_ptr, right_ptr, size); + do_swap (left_ptr, right_ptr, size, swap_type); if (mid == left_ptr) mid = right_ptr; else if (mid == right_ptr) @@ -216,7 +274,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, tmp_ptr = run_ptr; if (tmp_ptr != base_ptr) - SWAP (tmp_ptr, base_ptr, size); + do_swap (tmp_ptr, base_ptr, size, swap_type); /* Insertion sort, running from left-hand-side up to right-hand-side. */ From patchwork Thu Sep 14 18:37:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 76000 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B83BE385040B for ; Thu, 14 Sep 2023 18:37:38 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oi1-x22a.google.com (mail-oi1-x22a.google.com [IPv6:2607:f8b0:4864:20::22a]) by sourceware.org (Postfix) with ESMTPS id 8B358385697B for ; Thu, 14 Sep 2023 18:37:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8B358385697B Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oi1-x22a.google.com with SMTP id 5614622812f47-3a7ca8720a0so783074b6e.2 for ; Thu, 14 Sep 2023 11:37:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1694716636; x=1695321436; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=M3zU+xyjjiEt0GUXw16fcI1DylY5neN39pRNE2Kebrw=; b=EXAVT8+d8MFqNFuDqcq/D19YUqVAmlQRS9p3pU4Klfwi5e4u+qbpvdiNfykp4gY9xl DaBvKnIkbTpdcEU/1xmwn61Hh08bcQLp73j6WIuRfAKWeAcvAIQJn32l9Yn56DhPLW00 uNmmrOM/IDk//hfIG71bLnehx0XfNRryhfefGTM4pl8zWCOikKeG8EQ9E0DWTWPPsGPV br+SFsNbC3IACl4eDcwydFU0uTj4eX7hHuAiOMRmTBIUJV7ADZIJcAmRdwNzJwiKzBW7 tBriBhETPbbn7V7VBrZizdl2VIxSoQQnEjsaY13+rDUBEHKrTIcIl9EO8Tx/N/GEBxGY w+cA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694716636; x=1695321436; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M3zU+xyjjiEt0GUXw16fcI1DylY5neN39pRNE2Kebrw=; b=OsuFAWf9UU0LoHf5U+1K0Q1BJpp93J4ap2gAFeWXC4TqmAgsF8VQTqOdj/LIoY76j9 ziGniQFoWxTRzfHTqmJohOk/0DzkWmNANkkyHmT4vh71bZ6aDSGpqBGf5N5FOI9vUZoy GScbDmPigBYFBzbAP7jVyxxp+EaIY6IABwKRMFjrLfJDQV6u+zF/iKCWtfPBYHd+jXzb Ynw7m7nb1NSA4ODGxvIZYg4StI4FYAkt0FP1S8vspc8d/25ZtwJB4h5rlgG44PitHbm3 tjx01czH9ouiQ+RPSrGfsmLMv1QyKc3Tf7+iwKASkL1bWJsGK3MnuIP1Zxubke0qNYTw 2REw== X-Gm-Message-State: AOJu0Yzn+ZWpGoFNnPGAdsl2q+YgsiO/+Rbk2fl2kRSbyTBaCRfQPKY6 qKAqNpr8eQmiW2djluS7fxIq3yWwHozc4Z0Xgs3HVA== X-Google-Smtp-Source: AGHT+IEesZlKhpBzagHv6Qmrn2D7OFsQPjM6JbJqciWORDRflLyMKIOFCRfCJD2BEPntotBD7ZLhvQ== X-Received: by 2002:a05:6870:5621:b0:1d6:925:840d with SMTP id m33-20020a056870562100b001d60925840dmr7201995oao.55.1694716636318; Thu, 14 Sep 2023 11:37:16 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:91cb:5173:d70:d8d:4d2c]) by smtp.gmail.com with ESMTPSA id cc6-20020a056871e18600b001b3d67934e9sm1056930oac.26.2023.09.14.11.37.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 11:37:15 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Noah Goldstein , Paul Eggert , Florian Weimer Subject: [PATCH v7 3/7] stdlib: Move insertion sort out qsort Date: Thu, 14 Sep 2023 15:37:00 -0300 Message-Id: <20230914183704.1386534-4-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> References: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org --- stdlib/qsort.c | 101 ++++++++++++++++++++++++++----------------------- 1 file changed, 54 insertions(+), 47 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index bba9783191..30e2999005 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -110,6 +110,58 @@ typedef struct #define STACK_NOT_EMPTY (stack < top) +static inline void +insertion_sort_qsort_partitions (void *const pbase, size_t total_elems, + size_t size, enum swap_type_t swap_type, + __compar_d_fn_t cmp, void *arg) +{ + char *base_ptr = (char *) pbase; + char *const end_ptr = &base_ptr[size * (total_elems - 1)]; + char *tmp_ptr = base_ptr; +#define min(x, y) ((x) < (y) ? (x) : (y)) + const size_t max_thresh = MAX_THRESH * size; + char *thresh = min(end_ptr, base_ptr + max_thresh); + char *run_ptr; + + /* Find smallest element in first threshold and place it at the + array's beginning. This is the smallest array element, + and the operation speeds up insertion sort's inner loop. */ + + for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) + if (cmp (run_ptr, tmp_ptr, arg) < 0) + tmp_ptr = run_ptr; + + if (tmp_ptr != base_ptr) + do_swap (tmp_ptr, base_ptr, size, swap_type); + + /* Insertion sort, running from left-hand-side up to right-hand-side. */ + + run_ptr = base_ptr + size; + while ((run_ptr += size) <= end_ptr) + { + tmp_ptr = run_ptr - size; + while (cmp (run_ptr, tmp_ptr, arg) < 0) + tmp_ptr -= size; + + tmp_ptr += size; + if (tmp_ptr != run_ptr) + { + char *trav; + + trav = run_ptr + size; + while (--trav >= run_ptr) + { + char c = *trav; + char *hi, *lo; + + for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) + *hi = *lo; + *hi = c; + } + } + } +} + /* Order size using quicksort. This implementation incorporates four optimizations discussed in Sedgewick: @@ -256,51 +308,6 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, for partitions below MAX_THRESH size. BASE_PTR points to the beginning of the array to sort, and END_PTR points at the very last element in the array (*not* one beyond it!). */ - -#define min(x, y) ((x) < (y) ? (x) : (y)) - - { - char *const end_ptr = &base_ptr[size * (total_elems - 1)]; - char *tmp_ptr = base_ptr; - char *thresh = min(end_ptr, base_ptr + max_thresh); - char *run_ptr; - - /* Find smallest element in first threshold and place it at the - array's beginning. This is the smallest array element, - and the operation speeds up insertion sort's inner loop. */ - - for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) - if ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr = run_ptr; - - if (tmp_ptr != base_ptr) - do_swap (tmp_ptr, base_ptr, size, swap_type); - - /* Insertion sort, running from left-hand-side up to right-hand-side. */ - - run_ptr = base_ptr + size; - while ((run_ptr += size) <= end_ptr) - { - tmp_ptr = run_ptr - size; - while ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr -= size; - - tmp_ptr += size; - if (tmp_ptr != run_ptr) - { - char *trav; - - trav = run_ptr + size; - while (--trav >= run_ptr) - { - char c = *trav; - char *hi, *lo; - - for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) - *hi = *lo; - *hi = c; - } - } - } - } + insertion_sort_qsort_partitions (pbase, total_elems, size, swap_type, cmp, + arg); } From patchwork Thu Sep 14 18:37:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 76001 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A6E0F3857707 for ; Thu, 14 Sep 2023 18:37:53 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oi1-x230.google.com (mail-oi1-x230.google.com [IPv6:2607:f8b0:4864:20::230]) by sourceware.org (Postfix) with ESMTPS id 0D3E0385CC84 for ; Thu, 14 Sep 2023 18:37:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0D3E0385CC84 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oi1-x230.google.com with SMTP id 5614622812f47-3ab3aceaf2aso751800b6e.2 for ; Thu, 14 Sep 2023 11:37:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1694716638; x=1695321438; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=sADUp0XaqB4TBVHySLrhuZFvYiKFPkZWZjsa2k1NPuM=; b=gZERv/Vwdr710rwIR5CXw7XDlhIRTiYu55lyAO4OO90oTuSLyYfz00wEUrn4VdNza9 pNBTRGkodE5XMi8WV6O2OUqafgocHl+ARzl2xpZAmx1Sm9MSB51FCnU+Xs/ECs4iKC9Z 6eAbEfadEM4TdLlDuWJZ8bqRPznoPB7dQQwHJ5uRWrRgbT4VlccZI8UIIg2wn+XPIYhC KfVoJJeqai8rapDGfn0/dxa5m3XSzOmcOvrQDIzMgZ/A9EPGKZVPto5gIRMmpQ1IXnrU uciuhPgsHumlsNI5quWRAZYbq90107/Ea97+cZD1eKLSYaVuYaMspryUPt0UZfRsVTeC EEHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694716638; x=1695321438; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sADUp0XaqB4TBVHySLrhuZFvYiKFPkZWZjsa2k1NPuM=; b=t/eIAajXNWNY/6/8JZ88HMp8qofklRUvdDHBu6lRVCb0HrjPDSiqPNtmoLqXHiQWyT Y29bFgVoZfzCkV3g9UD9YPGfnCtV1+ut8AaF/kGLeG9VqZ4N8Dr8m7qxUuL+cUmkVq4U 5m9922OhT9auH9EfuxmkeNfcAjdyrsjMqwLm2HygZlcaal9lIEm/lhda3exw+BLZ+SP3 MD2i2sqFV/kwf2wLKnp2GkoxxCuU0ZSXC94t1P7dz0RhkdJueSRX+SKgeH+eUMLwmOh9 OSbRyUGVvyXIulXF4zChP12NlmWl5N3ejpVliyMltnkc/Jn85EqW/ZQLSs1B0L2qYImU P3YA== X-Gm-Message-State: AOJu0YzjtlNIilpFQSwlrjy57AB2mFGLPEPYYy8h4PPd4q68RzF66Wnh wWSk3x74X+SpPs2bq/12AlkLZRfDS7CT/C3yrK0lfQ== X-Google-Smtp-Source: AGHT+IHjmgK43KYM/0MXojNcPIZmTBnkVGHysdTsf/hNI54U8LmDAIHG6v57nVsMvdvzdKchnrrJvg== X-Received: by 2002:a05:6870:7aa:b0:1bb:8333:ab8a with SMTP id en42-20020a05687007aa00b001bb8333ab8amr6374600oab.4.1694716638771; Thu, 14 Sep 2023 11:37:18 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:91cb:5173:d70:d8d:4d2c]) by smtp.gmail.com with ESMTPSA id cc6-20020a056871e18600b001b3d67934e9sm1056930oac.26.2023.09.14.11.37.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 11:37:17 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Noah Goldstein , Paul Eggert , Florian Weimer Subject: [PATCH v7 4/7] stdlib: qsort: Move some macros to inline function Date: Thu, 14 Sep 2023 15:37:01 -0300 Message-Id: <20230914183704.1386534-5-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> References: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org --- stdlib/qsort.c | 35 +++++++++++++++++++++++------------ 1 file changed, 23 insertions(+), 12 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 30e2999005..4082f5f9c1 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -99,15 +99,28 @@ typedef struct char *hi; } stack_node; -/* The next 4 #defines implement a very fast in-line stack abstraction. */ /* The stack needs log (total_elements) entries (we could even subtract log(MAX_THRESH)). Since total_elements has type size_t, we get as upper bound for log (total_elements): bits per byte (CHAR_BIT) * sizeof(size_t). */ -#define STACK_SIZE (CHAR_BIT * sizeof (size_t)) -#define PUSH(low, high) ((void) ((top->lo = (low)), (top->hi = (high)), ++top)) -#define POP(low, high) ((void) (--top, (low = top->lo), (high = top->hi))) -#define STACK_NOT_EMPTY (stack < top) +enum { STACK_SIZE = CHAR_BIT * sizeof (size_t) }; + +static inline stack_node * +push (stack_node *top, char *lo, char *hi) +{ + top->lo = lo; + top->hi = hi; + return ++top; +} + +static inline stack_node * +pop (stack_node *top, char **lo, char **hi) +{ + --top; + *lo = top->lo; + *hi = top->hi; + return top; +} static inline void @@ -211,11 +224,9 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, char *lo = base_ptr; char *hi = &lo[size * (total_elems - 1)]; stack_node stack[STACK_SIZE]; - stack_node *top = stack; - - PUSH (NULL, NULL); + stack_node *top = stack + 1; - while (STACK_NOT_EMPTY) + while (stack < top) { char *left_ptr; char *right_ptr; @@ -280,7 +291,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, { if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore both small partitions. */ - POP (lo, hi); + top = pop (top, &lo, &hi); else /* Ignore small left partition. */ lo = left_ptr; @@ -291,13 +302,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else if ((right_ptr - lo) > (hi - left_ptr)) { /* Push larger left partition indices. */ - PUSH (lo, right_ptr); + top = push (top, lo, right_ptr); lo = left_ptr; } else { /* Push larger right partition indices. */ - PUSH (left_ptr, hi); + top = push (top, left_ptr, hi); hi = right_ptr; } } From patchwork Thu Sep 14 18:37:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 76002 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F0EB93836EA5 for ; Thu, 14 Sep 2023 18:37:54 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oi1-x22a.google.com (mail-oi1-x22a.google.com [IPv6:2607:f8b0:4864:20::22a]) by sourceware.org (Postfix) with ESMTPS id 54F043857704 for ; Thu, 14 Sep 2023 18:37:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 54F043857704 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oi1-x22a.google.com with SMTP id 5614622812f47-3aca0934e74so711307b6e.0 for ; Thu, 14 Sep 2023 11:37:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1694716641; x=1695321441; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=LVH9FPyqXhMnof20qz68Oz9ngaGXWyE1ZxEz6o7y/lg=; b=Lt4OziuYSifpujOmJQ7jzOMcnnVoPcGuqIyYKm4/oWOHHKYqXxfM6iLwvLnEu4tiHQ AIrWfD+HtGmRAn8doxd7PBv2T63Aw+obYdr/HpFRbvAWZQFsqrMLXkokIo/YVOey4D2P qSPC2lx6mnt67UybyrIXLvOV4ZxlnIlmnpNQiDMrO2MQ4JToo0W7B8ItOnu0K6pHhNlb kmM+8WtYzmL7XxQ6UmFhhtdUSwzH7IFWc07FDFhRZM3NrUXs/tgVJ0kmS1DrGPMNISx7 WO5splVsQezbirWQbaEIt9l5BDu8SUiwvnOhphoORY2yr68D9RFYCptwbWZfYwBvhvqH g2QQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694716641; x=1695321441; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LVH9FPyqXhMnof20qz68Oz9ngaGXWyE1ZxEz6o7y/lg=; b=eNgq7H338qfVLZLFgDrlaCoMYSrAxwukcNQuGOwFNs838yRt/1S+sEfj/0R/febfz/ JF3Alw2IXKrhUrbpbkcNGm8Io6v5e8w0Rk7aSH8+yrAxASjdzwXwhJbFoURErMqCy1Tt q/Fcm70pWX88b3wD8dJeNryS5HKpucdZdqv/lgcvUY3KkrV9o5OLLHPod6RPoZ2nmujJ bu2Za329Tc+uFI+qBive9hu4yTAo13yodSy4QjgEwPvwXcB6prOwDrIrKbwKD0y7bFjg SOdIbC4OEp8NAV+RFrlh9H2Eq8lWaLP25CUTHp+kNk0xvuDfm/HD/qE816Y+wRyFXDXs VKcw== X-Gm-Message-State: AOJu0Yx7WyZwYII3yL43PDvweNYyaH9qZMoWU7WHquOtC5GFA35rqo7q 9pWvBj44v5UL/aCDtk5cSAyutxuMrYNjpMi15wSbfw== X-Google-Smtp-Source: AGHT+IG+6F5vgxEaQeDt/jCENiBXCen3uWtB+IB0PVIGrLDo1h9CCKCsARz+plP3uQcbYhBRUgH0Ww== X-Received: by 2002:a05:6870:32d3:b0:1c0:fe16:90f8 with SMTP id r19-20020a05687032d300b001c0fe1690f8mr6837368oac.57.1694716641049; Thu, 14 Sep 2023 11:37:21 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:91cb:5173:d70:d8d:4d2c]) by smtp.gmail.com with ESMTPSA id cc6-20020a056871e18600b001b3d67934e9sm1056930oac.26.2023.09.14.11.37.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 11:37:20 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Noah Goldstein , Paul Eggert , Florian Weimer Subject: [PATCH v7 5/7] stdlib: Implement introsort for qsort (BZ 19305) Date: Thu, 14 Sep 2023 15:37:02 -0300 Message-Id: <20230914183704.1386534-6-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> References: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org This patch makes the quicksort implementation to acts as introsort, to avoid worse-case performance (and thus making it O(nlog n)). It switch to heapsort when the depth level reaches 2*log2(total elements). The heapsort is a textbook implementation. Checked on x86_64-linux-gnu. --- stdlib/qsort.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 80 insertions(+), 7 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 4082f5f9c1..e160932657 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -97,6 +97,7 @@ typedef struct { char *lo; char *hi; + size_t depth; } stack_node; /* The stack needs log (total_elements) entries (we could even subtract @@ -106,22 +107,83 @@ typedef struct enum { STACK_SIZE = CHAR_BIT * sizeof (size_t) }; static inline stack_node * -push (stack_node *top, char *lo, char *hi) +push (stack_node *top, char *lo, char *hi, size_t depth) { top->lo = lo; top->hi = hi; + top->depth = depth; return ++top; } static inline stack_node * -pop (stack_node *top, char **lo, char **hi) +pop (stack_node *top, char **lo, char **hi, size_t *depth) { --top; *lo = top->lo; *hi = top->hi; + *depth = top->depth; return top; } +/* A fast, small, non-recursive O(nlog n) heapsort, adapted from Linux + lib/sort.c. Used on introsort implementation as a fallback routine with + worst-case performance of O(nlog n) and worst-case space complexity of + O(1). */ + +static inline void +siftdown (void *base, size_t size, size_t k, size_t n, + enum swap_type_t swap_type, __compar_d_fn_t cmp, void *arg) +{ + while (k <= n / 2) + { + size_t j = 2 * k; + if (j < n && cmp (base + (j * size), base + ((j + 1) * size), arg) < 0) + j++; + + if (cmp (base + (k * size), base + (j * size), arg) >= 0) + break; + + do_swap (base + (size * j), base + (k * size), size, swap_type); + k = j; + } +} + +static inline void +heapify (void *base, size_t size, size_t n, enum swap_type_t swap_type, + __compar_d_fn_t cmp, void *arg) +{ + size_t k = n / 2; + while (1) + { + siftdown (base, size, k, n, swap_type, cmp, arg); + if (k-- == 0) + break; + } +} + +static void +heapsort_r (void *base, void *end, size_t size, enum swap_type_t swap_type, + __compar_d_fn_t cmp, void *arg) +{ + const size_t count = ((uintptr_t) end - (uintptr_t) base) / size; + + if (count < 2) + return; + + size_t n = count - 1; + + /* Build the binary heap, largest value at the base[0]. */ + heapify (base, size, n, swap_type, cmp, arg); + + /* On each iteration base[0:n] is the binary heap, while base[n:count] + is sorted. */ + while (n > 0) + { + do_swap (base, base + (n * size), size, swap_type); + n--; + siftdown (base, size, 0, n, swap_type, cmp, arg); + } +} static inline void insertion_sort_qsort_partitions (void *const pbase, size_t total_elems, @@ -207,7 +269,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, const size_t max_thresh = MAX_THRESH * size; - if (total_elems == 0) + if (total_elems <= 1) /* Avoid lossage with unsigned arithmetic below. */ return; @@ -219,15 +281,26 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else swap_type = SWAP_BYTES; + /* Maximum depth before quicksort switches to heapsort. */ + size_t depth = 2 * (sizeof (size_t) * CHAR_BIT - 1 + - __builtin_clzl (total_elems)); + if (total_elems > MAX_THRESH) { char *lo = base_ptr; char *hi = &lo[size * (total_elems - 1)]; stack_node stack[STACK_SIZE]; - stack_node *top = stack + 1; + stack_node *top = push (stack, NULL, NULL, depth); while (stack < top) { + if (depth == 0) + { + heapsort_r (lo, hi, size, swap_type, cmp, arg); + top = pop (top, &lo, &hi, &depth); + continue; + } + char *left_ptr; char *right_ptr; @@ -291,7 +364,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, { if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore both small partitions. */ - top = pop (top, &lo, &hi); + top = pop (top, &lo, &hi, &depth); else /* Ignore small left partition. */ lo = left_ptr; @@ -302,13 +375,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else if ((right_ptr - lo) > (hi - left_ptr)) { /* Push larger left partition indices. */ - top = push (top, lo, right_ptr); + top = push (top, lo, right_ptr, depth - 1); lo = left_ptr; } else { /* Push larger right partition indices. */ - top = push (top, left_ptr, hi); + top = push (top, left_ptr, hi, depth - 1); hi = right_ptr; } } From patchwork Thu Sep 14 18:37:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 76003 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0D75A3853D2F for ; Thu, 14 Sep 2023 18:38:21 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x29.google.com (mail-oa1-x29.google.com [IPv6:2001:4860:4864:20::29]) by sourceware.org (Postfix) with ESMTPS id E6336385CC82 for ; Thu, 14 Sep 2023 18:37:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E6336385CC82 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oa1-x29.google.com with SMTP id 586e51a60fabf-1c504386374so754000fac.3 for ; Thu, 14 Sep 2023 11:37:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1694716644; x=1695321444; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=llw7zqHijMQZ7ttL5JQP68kCyMBi6d2oaS8V1jxVogk=; b=afSShPqevXwu1GTqqSkxtjmPAkOZnbrYwKcjluh5y3afIDi/eCYPn3DnQ2ciSlJoPY nfL49KoyxoObLJ88M3oNdfBtziKZqktySOulsgqt5RLsf+0alIQ7yyK6mKd6fJ7POqN0 JN84/cUdeYOLaDqNVe7EQvSNNajtdABucNK96vdvXY1mP5I2IPWCqUIQkPS1lK/oIaV0 RycfCIL8e1TbbZePGOWgCx4kk7Og4xAojY9QCqn7Fae7rpUHvDVQrA/kmzwaxaEgHOio Nnh2SbDamDAO6LLTCuSAyK/LDRy7B0PnPLVjXOwAlO7MCAS3LbO5elb0dvy0HXA6Wf0t fyeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694716644; x=1695321444; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=llw7zqHijMQZ7ttL5JQP68kCyMBi6d2oaS8V1jxVogk=; b=sWqJIP9DddZiJOyuGdJKv6IcnKKcLDCP6EEtmTfb/wvSp4FXCZcGR1BoUx1URp5xS6 /pGTMKLqZ9aQkZ6ngPAdpOyx1x/uQ91ZM0spOplubHluTplbTYNKyRgqDyXucy1DWPFX M68NXS+cqOj+DPfesH1dV2gT7QIukQS0SrObxpUYfAREO6LjbiQ27sd3ZJ0CufU1BWmb e2bae9xW2CuKNINBurp8jcjoD99SNPj+mpiiyhqWQQ5Vfg4fY0Uc+5Twqwpk9N97uqbn SXcl0dgvK4ocg/CBerdVdijBa+4XM1d/p8mBzrbK8sL3N4SYP+vHprFj4VoroKuafc2O Mqfg== X-Gm-Message-State: AOJu0YwmD+0D73ImtdrhTLokkVLUnyufvvlY+BY5i7/DuuqYIh9NSq5a BtT31VdtQn5BezvqhWVI4I7l2N90zrrEYaJc6uGAAQ== X-Google-Smtp-Source: AGHT+IFCkv1OHq1422HjIOfYXRUpsxqNXWfN6Wx62C3gUc7U8J8km7Cws81IXJGzOQxdrj1fyzjj9w== X-Received: by 2002:a05:6870:20a:b0:1be:fcb8:6a48 with SMTP id j10-20020a056870020a00b001befcb86a48mr6895722oad.58.1694716643613; Thu, 14 Sep 2023 11:37:23 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:91cb:5173:d70:d8d:4d2c]) by smtp.gmail.com with ESMTPSA id cc6-20020a056871e18600b001b3d67934e9sm1056930oac.26.2023.09.14.11.37.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 11:37:22 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Noah Goldstein , Paul Eggert , Florian Weimer Subject: [PATCH v7 6/7] stdlib: Remove use of mergesort on qsort (BZ 21719) Date: Thu, 14 Sep 2023 15:37:03 -0300 Message-Id: <20230914183704.1386534-7-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> References: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org This patch removes the mergesort optimization on qsort implementation and uses the introsort instead. The mergesort implementation has some issues: - It is as-safe only for certain types sizes (if total size is less than 1 KB with large element sizes also forcing memory allocation) which contradicts the function documentation. Although not required by the C standard, it is preferable and doable to have an O(1) space implementation. - The malloc for certain element size and element number adds arbitrary latency (might even be worse if malloc is interposed). - To avoid trigger swap from memory allocation the implementation relies on system information that might be virtualized (for instance VMs with overcommit memory) which might lead to potentially use of swap even if system advertise more memory than actually has. The check also have the downside of issuing syscalls where none is expected (although only once per execution). - The mergesort is suboptimal on an already sorted array (BZ#21719). The introsort implementation is already optimized to use constant extra space (due to the limit of total number of elements from maximum VM size) and thus can be used to avoid the malloc usage issues. Resulting performance is slower due the usage of qsort, specially in the worst-case scenario (partialy or sorted arrays) and due the fact mergesort uses a slight improved swap operations. This change also renders the BZ#21719 fix unrequired (since it is meant to fix the sorted input performance degradation for mergesort). The manual is also updated to indicate the function is now async-cancel safe. Checked on x86_64-linux-gnu. --- include/stdlib.h | 2 - manual/argp.texi | 2 +- manual/locale.texi | 3 +- manual/search.texi | 7 +- stdlib/Makefile | 2 - stdlib/msort.c | 309 --------------------------------------------- stdlib/qsort.c | 14 +- 7 files changed, 16 insertions(+), 323 deletions(-) delete mode 100644 stdlib/msort.c diff --git a/include/stdlib.h b/include/stdlib.h index 0ed8271d9b..580da9be15 100644 --- a/include/stdlib.h +++ b/include/stdlib.h @@ -149,8 +149,6 @@ extern int __posix_openpt (int __oflag) attribute_hidden; extern int __add_to_environ (const char *name, const char *value, const char *combines, int replace) attribute_hidden; -extern void _quicksort (void *const pbase, size_t total_elems, - size_t size, __compar_d_fn_t cmp, void *arg); extern int __on_exit (void (*__func) (int __status, void *__arg), void *__arg); diff --git a/manual/argp.texi b/manual/argp.texi index 0023441812..b77ad68285 100644 --- a/manual/argp.texi +++ b/manual/argp.texi @@ -735,7 +735,7 @@ for options, bad phase of the moon, etc. @c hol_set_group ok @c hol_find_entry ok @c hol_sort @mtslocale @acucorrupt -@c qsort dup @acucorrupt +@c qsort dup @c hol_entry_qcmp @mtslocale @c hol_entry_cmp @mtslocale @c group_cmp ok diff --git a/manual/locale.texi b/manual/locale.texi index 720e0ca952..f6afa5dc44 100644 --- a/manual/locale.texi +++ b/manual/locale.texi @@ -253,7 +253,7 @@ The symbols in this section are defined in the header file @file{locale.h}. @c calculate_head_size ok @c __munmap ok @c compute_hashval ok -@c qsort dup @acucorrupt +@c qsort dup @c rangecmp ok @c malloc @ascuheap @acsmem @c strdup @ascuheap @acsmem @@ -275,7 +275,6 @@ The symbols in this section are defined in the header file @file{locale.h}. @c realloc @ascuheap @acsmem @c realloc @ascuheap @acsmem @c fclose @ascuheap @asulock @acsmem @acsfd @aculock -@c qsort @ascuheap @acsmem @c alias_compare dup @c libc_lock_unlock @aculock @c _nl_explode_name @ascuheap @acsmem diff --git a/manual/search.texi b/manual/search.texi index 5691bf2f2b..a550858478 100644 --- a/manual/search.texi +++ b/manual/search.texi @@ -159,7 +159,7 @@ To sort an array using an arbitrary comparison function, use the @deftypefun void qsort (void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) @standards{ISO, stdlib.h} -@safety{@prelim{}@mtsafe{}@assafe{}@acunsafe{@acucorrupt{}}} +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} The @code{qsort} function sorts the array @var{array}. The array contains @var{count} elements, each of which is of size @var{size}. @@ -199,9 +199,8 @@ Functions}): The @code{qsort} function derives its name from the fact that it was originally implemented using the ``quick sort'' algorithm. -The implementation of @code{qsort} in this library might not be an -in-place sort and might thereby use an extra amount of memory to store -the array. +The implementation of @code{qsort} in this library is an in-place sort +and uses a constant extra space (allocated on the stack). @end deftypefun @node Search/Sort Example diff --git a/stdlib/Makefile b/stdlib/Makefile index 25e42a77e7..095518eef4 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -96,7 +96,6 @@ routines := \ mbtowc \ mrand48 \ mrand48_r \ - msort \ nrand48 \ nrand48_r \ old_atexit \ @@ -380,7 +379,6 @@ generated += \ # generated CFLAGS-bsearch.c += $(uses-callbacks) -CFLAGS-msort.c += $(uses-callbacks) CFLAGS-qsort.c += $(uses-callbacks) CFLAGS-system.c += -fexceptions CFLAGS-system.os = -fomit-frame-pointer diff --git a/stdlib/msort.c b/stdlib/msort.c deleted file mode 100644 index bbaa5e9f82..0000000000 --- a/stdlib/msort.c +++ /dev/null @@ -1,309 +0,0 @@ -/* An alternative to qsort, with an identical interface. - This file is part of the GNU C Library. - Copyright (C) 1992-2023 Free Software Foundation, Inc. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include -#include -#include -#include -#include -#include -#include - -struct msort_param -{ - size_t s; - size_t var; - __compar_d_fn_t cmp; - void *arg; - char *t; -}; -static void msort_with_tmp (const struct msort_param *p, void *b, size_t n); - -static void -msort_with_tmp (const struct msort_param *p, void *b, size_t n) -{ - char *b1, *b2; - size_t n1, n2; - - if (n <= 1) - return; - - n1 = n / 2; - n2 = n - n1; - b1 = b; - b2 = (char *) b + (n1 * p->s); - - msort_with_tmp (p, b1, n1); - msort_with_tmp (p, b2, n2); - - char *tmp = p->t; - const size_t s = p->s; - __compar_d_fn_t cmp = p->cmp; - void *arg = p->arg; - switch (p->var) - { - case 0: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - *(uint32_t *) tmp = *(uint32_t *) b1; - b1 += sizeof (uint32_t); - --n1; - } - else - { - *(uint32_t *) tmp = *(uint32_t *) b2; - b2 += sizeof (uint32_t); - --n2; - } - tmp += sizeof (uint32_t); - } - break; - case 1: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - *(uint64_t *) tmp = *(uint64_t *) b1; - b1 += sizeof (uint64_t); - --n1; - } - else - { - *(uint64_t *) tmp = *(uint64_t *) b2; - b2 += sizeof (uint64_t); - --n2; - } - tmp += sizeof (uint64_t); - } - break; - case 2: - while (n1 > 0 && n2 > 0) - { - unsigned long *tmpl = (unsigned long *) tmp; - unsigned long *bl; - - tmp += s; - if ((*cmp) (b1, b2, arg) <= 0) - { - bl = (unsigned long *) b1; - b1 += s; - --n1; - } - else - { - bl = (unsigned long *) b2; - b2 += s; - --n2; - } - while (tmpl < (unsigned long *) tmp) - *tmpl++ = *bl++; - } - break; - case 3: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (*(const void **) b1, *(const void **) b2, arg) <= 0) - { - *(void **) tmp = *(void **) b1; - b1 += sizeof (void *); - --n1; - } - else - { - *(void **) tmp = *(void **) b2; - b2 += sizeof (void *); - --n2; - } - tmp += sizeof (void *); - } - break; - default: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - tmp = (char *) __mempcpy (tmp, b1, s); - b1 += s; - --n1; - } - else - { - tmp = (char *) __mempcpy (tmp, b2, s); - b2 += s; - --n2; - } - } - break; - } - - if (n1 > 0) - memcpy (tmp, b1, n1 * s); - memcpy (b, p->t, (n - n2) * s); -} - - -void -__qsort_r (void *b, size_t n, size_t s, __compar_d_fn_t cmp, void *arg) -{ - size_t size = n * s; - char *tmp = NULL; - struct msort_param p; - - /* For large object sizes use indirect sorting. */ - if (s > 32) - size = 2 * n * sizeof (void *) + s; - - if (size < 1024) - /* The temporary array is small, so put it on the stack. */ - p.t = __alloca (size); - else - { - /* We should avoid allocating too much memory since this might - have to be backed up by swap space. */ - static long int phys_pages; - static int pagesize; - - if (pagesize == 0) - { - phys_pages = __sysconf (_SC_PHYS_PAGES); - - if (phys_pages == -1) - /* Error while determining the memory size. So let's - assume there is enough memory. Otherwise the - implementer should provide a complete implementation of - the `sysconf' function. */ - phys_pages = (long int) (~0ul >> 1); - - /* The following determines that we will never use more than - a quarter of the physical memory. */ - phys_pages /= 4; - - /* Make sure phys_pages is written to memory. */ - atomic_write_barrier (); - - pagesize = __sysconf (_SC_PAGESIZE); - } - - /* Just a comment here. We cannot compute - phys_pages * pagesize - and compare the needed amount of memory against this value. - The problem is that some systems might have more physical - memory then can be represented with a `size_t' value (when - measured in bytes. */ - - /* If the memory requirements are too high don't allocate memory. */ - if (size / pagesize > (size_t) phys_pages) - { - _quicksort (b, n, s, cmp, arg); - return; - } - - /* It's somewhat large, so malloc it. */ - int save = errno; - tmp = malloc (size); - __set_errno (save); - if (tmp == NULL) - { - /* Couldn't get space, so use the slower algorithm - that doesn't need a temporary array. */ - _quicksort (b, n, s, cmp, arg); - return; - } - p.t = tmp; - } - - p.s = s; - p.var = 4; - p.cmp = cmp; - p.arg = arg; - - if (s > 32) - { - /* Indirect sorting. */ - char *ip = (char *) b; - void **tp = (void **) (p.t + n * sizeof (void *)); - void **t = tp; - void *tmp_storage = (void *) (tp + n); - - while ((void *) t < tmp_storage) - { - *t++ = ip; - ip += s; - } - p.s = sizeof (void *); - p.var = 3; - msort_with_tmp (&p, p.t + n * sizeof (void *), n); - - /* tp[0] .. tp[n - 1] is now sorted, copy around entries of - the original array. Knuth vol. 3 (2nd ed.) exercise 5.2-10. */ - char *kp; - size_t i; - for (i = 0, ip = (char *) b; i < n; i++, ip += s) - if ((kp = tp[i]) != ip) - { - size_t j = i; - char *jp = ip; - memcpy (tmp_storage, ip, s); - - do - { - size_t k = (kp - (char *) b) / s; - tp[j] = jp; - memcpy (jp, kp, s); - j = k; - jp = kp; - kp = tp[k]; - } - while (kp != ip); - - tp[j] = jp; - memcpy (jp, tmp_storage, s); - } - } - else - { - if ((s & (sizeof (uint32_t) - 1)) == 0 - && ((uintptr_t) b) % __alignof__ (uint32_t) == 0) - { - if (s == sizeof (uint32_t)) - p.var = 0; - else if (s == sizeof (uint64_t) - && ((uintptr_t) b) % __alignof__ (uint64_t) == 0) - p.var = 1; - else if ((s & (sizeof (unsigned long) - 1)) == 0 - && ((uintptr_t) b) - % __alignof__ (unsigned long) == 0) - p.var = 2; - } - msort_with_tmp (&p, b, n); - } - free (tmp); -} -libc_hidden_def (__qsort_r) -weak_alias (__qsort_r, qsort_r) - - -void -qsort (void *b, size_t n, size_t s, __compar_fn_t cmp) -{ - return __qsort_r (b, n, s, (__compar_d_fn_t) cmp, NULL); -} -libc_hidden_def (qsort) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index e160932657..470b09b10e 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -19,7 +19,6 @@ Engineering a sort function; Jon Bentley and M. Douglas McIlroy; Software - Practice and Experience; Vol. 23 (11), 1249-1265, 1993. */ -#include #include #include #include @@ -262,8 +261,8 @@ insertion_sort_qsort_partitions (void *const pbase, size_t total_elems, stack size is needed (actually O(1) in this case)! */ void -_quicksort (void *const pbase, size_t total_elems, size_t size, - __compar_d_fn_t cmp, void *arg) +__qsort_r (void *const pbase, size_t total_elems, size_t size, + __compar_d_fn_t cmp, void *arg) { char *base_ptr = (char *) pbase; @@ -395,3 +394,12 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, insertion_sort_qsort_partitions (pbase, total_elems, size, swap_type, cmp, arg); } +libc_hidden_def (__qsort_r) +weak_alias (__qsort_r, qsort_r) + +void +qsort (void *b, size_t n, size_t s, __compar_fn_t cmp) +{ + return __qsort_r (b, n, s, (__compar_d_fn_t) cmp, NULL); +} +libc_hidden_def (qsort) From patchwork Thu Sep 14 18:37:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 76004 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3BE113853D26 for ; Thu, 14 Sep 2023 18:38:44 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x2d.google.com (mail-oa1-x2d.google.com [IPv6:2001:4860:4864:20::2d]) by sourceware.org (Postfix) with ESMTPS id 0BA15385DC0D for ; Thu, 14 Sep 2023 18:37:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0BA15385DC0D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oa1-x2d.google.com with SMTP id 586e51a60fabf-1bbb4bde76dso648381fac.2 for ; Thu, 14 Sep 2023 11:37:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1694716646; x=1695321446; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=7TcY49J55BYUyAkq/YRTuBz+HDpV/m8KrGvfXTPX2bw=; b=U8xrju3Bz6jCSZIQrwRC2/T+IjwNqQEGjNJzaxiPHGy+gBx0uteM9LgD1icllLRmzt oxgWKc35b5DC52mU0yamm7pIX7/nTzAzdgtMSgwDjeSvv8Ra5WG5F2o7tHBHIdb2QXsb L9kIn0fQy502qhdvcS3us8WLEF7WNWz2zwkb35R9iVEOzk7tX4U1YULCBtL8dkjMWc2t rfu7CR/e7IOiqs3cwQghvSFbzOpjBq83e9nX70N30aCIbDgoB+9elAGghdV5wig5AZl9 flooTfMj3+UK3RoGk1MGhbEx+MVcAg3AuWgijjipHGym/tcBZOrQ+13HjVkqLWtQHs9r NPVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694716646; x=1695321446; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7TcY49J55BYUyAkq/YRTuBz+HDpV/m8KrGvfXTPX2bw=; b=LI7aq3AyTlu4a1uRrNbAqovWDk99QBu9w6OPvxvsbg/e+KUKLtcY9+gzGgspk/LXyU NlUeXJ4YKx1PVEdZjZuTvwhQ2mulZBvAqk2nRU49ZG5K4d6lcmtyeqORvTZp/IT113rm N0nznC8a8eqSHirLEY3boMnoci1j4w4uGsewfGikJYFqwpKnbIBn+rD0OXWS9hcGfxBU JgO9F0TrmyvucfiHmfvx2q5y0sPYpSEvyfGpBr97aFgqQaHk6gisj79rGLtN6cwek+Z+ 6DnRkG0I6EpemuTh+9gkjfCBUSxhgjgvYIJEuWbEh3sWNKPLaLiABA9ziUpTyjAQAUdP 6y3A== X-Gm-Message-State: AOJu0YzQIFV7EopogPsMPeAQXjwjId2bLj0A0KW8Vy+TqNcyM/txAdci DAiY293ZlRSWkU7V9zxRQzKqijiBgTQR43bXFmh5pg== X-Google-Smtp-Source: AGHT+IF+qqlLNNye47KR5D0unZMKhMphEDA/YABCKEHWfq7ueg74MPF5SLM12jRMYl5dItluXl75Fw== X-Received: by 2002:a05:6870:32ca:b0:1ba:bb13:d007 with SMTP id r10-20020a05687032ca00b001babb13d007mr6985699oac.5.1694716645944; Thu, 14 Sep 2023 11:37:25 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c0:91cb:5173:d70:d8d:4d2c]) by smtp.gmail.com with ESMTPSA id cc6-20020a056871e18600b001b3d67934e9sm1056930oac.26.2023.09.14.11.37.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 11:37:25 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Noah Goldstein , Paul Eggert , Florian Weimer Subject: [PATCH v7 7/7] stdlib: Add more qsort{_r} coverage Date: Thu, 14 Sep 2023 15:37:04 -0300 Message-Id: <20230914183704.1386534-8-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> References: <20230914183704.1386534-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org This patch adds a qsort and qsort_r to trigger the worst case scenario for the quicksort (which glibc current lacks coverage). The test is done with random input, dfferent internal types (uint8_t, uint16_t, uint32_t, uint64_t, large size), and with different set of element numbers. Checked on x86_64-linux-gnu and i686-linux-gnu. --- stdlib/Makefile | 1 + stdlib/tst-qsort3.c | 366 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 367 insertions(+) create mode 100644 stdlib/tst-qsort3.c diff --git a/stdlib/Makefile b/stdlib/Makefile index 095518eef4..6af606136e 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -214,6 +214,7 @@ tests := \ tst-on_exit \ tst-qsort \ tst-qsort2 \ + tst-qsort3 \ tst-quick_exit \ tst-rand48 \ tst-rand48-2 \ diff --git a/stdlib/tst-qsort3.c b/stdlib/tst-qsort3.c new file mode 100644 index 0000000000..421560d744 --- /dev/null +++ b/stdlib/tst-qsort3.c @@ -0,0 +1,366 @@ +/* qsort(_r) tests to trigger worst case for quicksort. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +typedef enum +{ + Sorted, + Random, + Repeated, + Bitonic, + Duplicated, +} arraytype_t; + +/* Ratio of total of elements which will be repeated. */ +static const double RepeatedRatio = 0.2; + +/* Ratio of duplicated element . */ +static const double DuplicatedRatio = 0.4; + +struct array_t +{ + arraytype_t type; + const char *name; +} static const arraytypes[] = +{ + { Sorted, "Sorted" }, + { Random, "Random" }, + { Repeated, "Repeated" }, + { Bitonic, "Bitonic" }, + { Duplicated, "Duplicated" }, +}; + +/* Return the index of BASE as interpreted as an array of elements + of size SIZE. */ +static inline void * +arr (void *base, size_t idx, size_t size) +{ + return (void*)((uintptr_t)base + (idx * size)); +} + +/* Functions used to check qsort. */ +static int +uint8_t_cmp (const void *a, const void *b) +{ + uint8_t ia = *(uint8_t*)a; + uint8_t ib = *(uint8_t*)b; + return (ia > ib) - (ia < ib); +} + +static int +uint16_t_cmp (const void *a, const void *b) +{ + uint16_t ia = *(uint16_t*)a; + uint16_t ib = *(uint16_t*)b; + return (ia > ib) - (ia < ib); +} + +static int +uint32_t_cmp (const void *a, const void *b) +{ + uint32_t ia = *(uint32_t*)a; + uint32_t ib = *(uint32_t*)b; + return (ia > ib) - (ia < ib); +} + +static int +uint64_t_cmp (const void *a, const void *b) +{ + uint64_t ia = *(uint64_t*)a; + uint64_t ib = *(uint64_t*)b; + return (ia > ib) - (ia < ib); +} + +#define LARGE_SIZE 47 + +static int +large_cmp (const void *a, const void *b) +{ + return memcmp (a, b, LARGE_SIZE); +} + +/* Function used to check qsort_r. */ +typedef enum +{ + UINT8_CMP_T, + UINT16_CMP_T, + UINT32_CMP_T, + UINT64_CMP_T, + LARGE_CMP_T +} type_cmp_t; + +static type_cmp_t +uint_t_cmp_type (size_t sz) +{ + switch (sz) + { + case sizeof (uint8_t): return UINT8_CMP_T; + case sizeof (uint16_t): return UINT16_CMP_T; + case sizeof (uint64_t): return UINT64_CMP_T; + case sizeof (uint32_t): return UINT32_CMP_T; + default: return LARGE_CMP_T; + } +} + +static int +uint_t_cmp (const void *a, const void *b, void *arg) +{ + type_cmp_t type = *(type_cmp_t*) arg; + switch (type) + { + case UINT8_CMP_T: return uint8_t_cmp (a, b); + case UINT32_CMP_T: return uint32_t_cmp (a, b); + case UINT16_CMP_T: return uint16_t_cmp (a, b); + case UINT64_CMP_T: return uint64_t_cmp (a, b); + default: return large_cmp (a, b); + } +} + +static void +seq (void *elem, size_t type_size, int value) +{ + if (type_size == sizeof (uint8_t)) + *(uint8_t*)elem = value; + else if (type_size == sizeof (uint16_t)) + *(uint16_t*)elem = value; + else if (type_size == sizeof (uint32_t)) + *(uint32_t*)elem = value; + else if (type_size == sizeof (uint64_t)) + *(uint64_t*)elem = value; + else + memset (elem, value, type_size); +} + +static void +fill_array (void *array, void *refarray, size_t nmemb, size_t type_size, + arraytype_t type) +{ + size_t size = nmemb * type_size; + + switch (type) + { + case Sorted: + for (size_t i = 0; i < nmemb; i++) + seq (arr (array, i, type_size), type_size, i); + break; + + case Random: + arc4random_buf (array, size); + break; + + case Repeated: + { + arc4random_buf (array, size); + + void *randelem = xmalloc (type_size); + arc4random_buf (randelem, type_size); + + /* Repeat REPEATED elements (based on RepeatRatio ratio) in the random + array. */ + size_t repeated = (size_t)(nmemb * RepeatedRatio); + for (size_t i = 0; i < repeated; i++) + { + size_t pos = arc4random_uniform (nmemb - 1); + memcpy (arr (array, pos, type_size), randelem, type_size); + } + free (randelem); + } + break; + + case Bitonic: + { + size_t i; + for (i = 0; i < nmemb / 2; i++) + seq (arr (array, i, type_size), type_size, i); + for ( ; i < nmemb; i++) + seq (arr (array, i, type_size), type_size, (nmemb - 1) - i); + } + break; + + case Duplicated: + { + int randelem1 = arc4random (); + for (size_t i = 0; i < nmemb; i++) + seq (arr (array, i, type_size), type_size, randelem1); + + size_t duplicates = (size_t)(nmemb * DuplicatedRatio); + int randelem2 = arc4random (); + for (size_t i = 0; i < duplicates; i++) + { + size_t pos = arc4random_uniform (nmemb - 1); + seq (arr (array, pos, type_size), type_size, randelem2); + } + } + break; + } + + memcpy (refarray, array, size); +} + +typedef int (*cmpfunc_t)(const void *, const void *); + +/* Simple insertion sort to use as reference sort. */ +static void +qsort_r_ref (void *p, size_t n, size_t s, __compar_d_fn_t cmp, void *arg) +{ + if (n <= 1) + return; + + int i = 1; + char tmp[s]; + while (i < n) + { + memcpy (tmp, arr (p, i, s), s); + int j = i - 1; + while (j >= 0 && cmp (arr (p, j, s), tmp, arg) > 0) + { + memcpy (arr (p, j + 1, s), arr (p, j, s), s); + j = j - 1; + } + memcpy (arr (p, j + 1, s), tmp, s); + i = i + 1; + } +} + +static void +qsort_ref (void *b, size_t n, size_t s, __compar_fn_t cmp) +{ + return qsort_r_ref (b, n, s, (__compar_d_fn_t) cmp, NULL); +} + +/* Check if ARRAY of total NMEMB element of size SIZE is sorted + based on CMPFUNC. */ +static void +check_array (void *array, void *refarray, size_t nmemb, size_t type_size, + cmpfunc_t cmpfunc) +{ + for (size_t i = 1; i < nmemb; i++) + { + int ret = cmpfunc (arr (array, i, type_size), + arr (array, i-1, type_size)); + TEST_VERIFY_EXIT (ret >= 0); + } + + size_t size = nmemb * type_size; + TEST_COMPARE_BLOB (array, size, refarray, size); +} + +static void +check_qsort (void *buf, void *refbuf, size_t nelem, size_t type_size, + arraytype_t type, cmpfunc_t cmpfunc) +{ + fill_array (buf, refbuf, nelem, type_size, type); + + qsort (buf, nelem, type_size, cmpfunc); + qsort_ref (refbuf, nelem, type_size, cmpfunc); + + check_array (buf, refbuf, nelem, type_size, cmpfunc); +} + +static void +check_qsort_r (void *buf, void *refbuf, size_t nelem, size_t type_size, + arraytype_t type, cmpfunc_t cmpfunc) +{ + fill_array (buf, refbuf, nelem, type_size, type); + + type_cmp_t typecmp = uint_t_cmp_type (type_size); + + qsort_r (buf, nelem, type_size, uint_t_cmp, &typecmp); + qsort_r_ref (refbuf, nelem, type_size, uint_t_cmp, &typecmp); + + check_array (buf, refbuf, nelem, type_size, cmpfunc); +} + +static int +do_test (void) +{ + /* Some random sizes. */ + static const size_t nelems[] = { 0, 1, 7, 20, 32, 100, 256, 1024, 4256 }; + size_t max_nelems = 0; + for (int i = 0; i < array_length (nelems); i++) + if (nelems[i] > max_nelems) + max_nelems = nelems[i]; + + static const struct test_t + { + size_t type_size; + cmpfunc_t cmpfunc; + } + tests[] = + { + { sizeof (uint8_t), uint8_t_cmp }, + { sizeof (uint16_t), uint16_t_cmp }, + { sizeof (uint32_t), uint32_t_cmp }, + { sizeof (uint64_t), uint64_t_cmp }, + /* Test swap with large elements. */ + { LARGE_SIZE, large_cmp }, + }; + size_t max_type_size = 0; + for (int i = 0; i < array_length (tests); i++) + if (tests[i].type_size > max_type_size) + max_type_size = tests[i].type_size; + + void *buf = reallocarray (NULL, max_nelems, max_type_size); + TEST_VERIFY_EXIT (buf != NULL); + void *refbuf = reallocarray (NULL, max_nelems, max_type_size); + TEST_VERIFY_EXIT (refbuf != NULL); + + for (const struct test_t *test = tests; test < array_end (tests); ++test) + { + if (test_verbose > 0) + printf ("info: testing qsort with type_size=%zu\n", test->type_size); + for (const struct array_t *arraytype = arraytypes; + arraytype < array_end (arraytypes); + ++arraytype) + { + if (test_verbose > 0) + printf (" distribution=%s\n", arraytype->name); + for (const size_t *nelem = nelems; + nelem < array_end (nelems); + ++nelem) + { + if (test_verbose > 0) + printf (" nelem=%zu, total size=%zu\n", *nelem, + *nelem * test->type_size); + + check_qsort (buf, refbuf, *nelem, test->type_size, + arraytype->type, test->cmpfunc); + check_qsort_r (buf, refbuf, *nelem, test->type_size, + arraytype->type, test->cmpfunc); + } + } + } + + free (buf); + free (refbuf); + + return 0; +} + +#include