From patchwork Mon Sep 17 10:37:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Wilco Dijkstra X-Patchwork-Id: 29414 Received: (qmail 55468 invoked by alias); 17 Sep 2018 10:37:27 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 55437 invoked by uid 89); 17 Sep 2018 10:37:27 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 spammy=23s, H*f:sk:HE1PR08, Dijkstra, dijkstra X-HELO: EUR04-DB3-obe.outbound.protection.outlook.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector1-arm-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=fi+xUTUqfsCpDMEULF8ekCphFCjZ5YxEaWnh+XsSciw=; b=VWl42YTt/l8HacWYGMnIdRkFawy9mJA6ShxsN00TVKv7sUMSIEhJK7qwShUr1Fq++APAMkkYQEpRjSMsK6OEBQtLQ87p11OJeqfjf07K9/mVeN1dA8fbSbunimerGSzBD6FmgzAp4YtPXN9Ybki3Y/Sgc1yYd0UFepHu7vSrASk= From: Wilco Dijkstra To: GLIBC Devel CC: nd Subject: Re: [PATCH v2] Fix strstr bug with huge needles (bug 23637) Date: Mon, 17 Sep 2018 10:37:18 +0000 Message-ID: References: , , In-Reply-To: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Wilco.Dijkstra@arm.com; received-spf: None (protection.outlook.com: arm.com does not designate permitted sender hosts) MIME-Version: 1.0 ping   As reported in https://sourceware.org/ml/libc-help/2018-09/msg00000.html , the generic strstr in GLIBC 2.28 fails to match huge needles.  The optimized AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of repeatedly checking for the end of the string.  However if the needle length is larger than this, two_way_long_needle may confuse this as meaning the end of the string and return NULL.  This is fixed by adding the needle length to the amount to read ahead. Passes GLIBC regress on AArch64/x64, new tests pass. ChangeLog: 2018-08-13  Wilco Dijkstra          [BZ #23637]         * string/test-strstr.c (pr23637): New function.         (test_main): Add tests with longer needles.         * string/strcasestr.c (AVAILABLE): Fix readahead distance.         * string/strstr.c (AVAILABLE): Likewise. diff --git a/string/strcasestr.c b/string/strcasestr.c index 1f6b7b846f81d0d8fe77b390062d2d86be11b056..8aa76037dcc052f34ee4a8594e017ec822ce892b 100644 --- a/string/strcasestr.c +++ b/string/strcasestr.c @@ -37,8 +37,9 @@  /* Two-Way algorithm.  */  #define RETURN_TYPE char *  #define AVAILABLE(h, h_l, j, n_l)                       \ -  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \ -                             (j) + (n_l) <= (h_l))) +  (((j) + (n_l) <= (h_l)) \ +   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \ +       (j) + (n_l) <= (h_l)))  #define CHECK_EOL (1)  #define RET0_IF_0(a) if (!a) goto ret0  #define CANON_ELEMENT(c) TOLOWER (c) diff --git a/string/strstr.c b/string/strstr.c index 33acdc5442d3e9031298a56e4f52a19e2ffa7d84..f74d7189ed1319f6225525cc2d32380745de1523 100644 --- a/string/strstr.c +++ b/string/strstr.c @@ -33,8 +33,9 @@    #define RETURN_TYPE char *  #define AVAILABLE(h, h_l, j, n_l)                       \ -  (((j) + (n_l) <= (h_l)) || ((h_l) += __strnlen ((void*)((h) + (h_l)), 512), \ -                             (j) + (n_l) <= (h_l))) +  (((j) + (n_l) <= (h_l)) \ +   || ((h_l) += __strnlen ((void*)((h) + (h_l)), (n_l) + 512), \ +       (j) + (n_l) <= (h_l)))  #define CHECK_EOL (1)  #define RET0_IF_0(a) if (!a) goto ret0  #define FASTSEARCH(S,C,N) (void*) strchr ((void*)(S), (C)) diff --git a/string/test-strstr.c b/string/test-strstr.c index 8d99716ff39cc2c22ebebdacb5f30438f9b32ffc..5861b01b73e4c315c471d68300f2ba25e3533ac1 100644 --- a/string/test-strstr.c +++ b/string/test-strstr.c @@ -151,6 +151,32 @@ check2 (void)      }  }   +#define N 1024 + +static void +pr23637 (void) +{ +  char *h = (char*) buf1; +  char *n = (char*) buf2; + +  for (int i = 0; i < N; i++) +    { +      n[i] = 'x'; +      h[i] = ' '; +      h[i + N] = 'x'; +    } + +  n[N] = '\0'; +  h[N * 2] = '\0'; + +  /* Ensure we don't match at the first 'x'.  */ +  h[0] = 'x'; + +  char *exp_result = stupid_strstr (h, n); +  FOR_EACH_IMPL (impl, 0) +    check_result (impl, h, n, exp_result); +} +  static int  test_main (void)  { @@ -158,6 +184,7 @@ test_main (void)      check1 ();    check2 (); +  pr23637 ();      printf ("%23s", "");    FOR_EACH_IMPL (impl, 0) @@ -202,6 +229,9 @@ test_main (void)          do_test (15, 9, hlen, klen, 1);          do_test (15, 15, hlen, klen, 0);          do_test (15, 15, hlen, klen, 1); + +       do_test (15, 15, hlen + klen * 4, klen * 4, 0); +       do_test (15, 15, hlen + klen * 4, klen * 4, 1);        }      do_test (0, 0, page_size - 1, 16, 0);