From patchwork Mon Jun 22 15:15:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 39757 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DFB59395253A; Mon, 22 Jun 2020 15:15:23 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DFB59395253A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1592838923; bh=zyyje9TjFunexAiuqCnYKRID8CvPwqhc+xQje5qF9G0=; h=To:Subject:In-Reply-To:References:Date:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=EzetzkUj2ZygbhOtJjpUu+zNfPrX4h6yPwUSMS/8BQcCZRgZaveUr+PGenZRem9q3 j1VB16q7iOJ1ys8PMFTO/Tqk56FG5AlphiqjCZ8QPHeL6AF/i53Cf7mK2vB+/n6H26 kTsGHxgf9YFeRE9X7e5yl/awzMxXbcsafjJdBbn8= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-delivery-1.mimecast.com (us-smtp-1.mimecast.com [205.139.110.61]) by sourceware.org (Postfix) with ESMTP id B66343890401 for ; Mon, 22 Jun 2020 15:15:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org B66343890401 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-54-7rU8_0dnPJag4B1pRJ-z9Q-1; Mon, 22 Jun 2020 11:15:17 -0400 X-MC-Unique: 7rU8_0dnPJag4B1pRJ-z9Q-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id DD913EC1A6 for ; Mon, 22 Jun 2020 15:15:16 +0000 (UTC) Received: from oldenburg2.str.redhat.com (ovpn-112-185.ams2.redhat.com [10.36.112.185]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 0BED610013D7 for ; Mon, 22 Jun 2020 15:15:15 +0000 (UTC) To: libc-alpha@sourceware.org Subject: [PATCH 26/30] elf: Implement a string table for ldconfig, with tail merging In-Reply-To: References: Message-Id: <6179006fc963179bb15828d7180b0fd836889a63.1592836143.git.fweimer@redhat.com> Date: Mon, 22 Jun 2020 17:15:14 +0200 User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Florian Weimer via Libc-alpha From: Florian Weimer Reply-To: Florian Weimer Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" This will be used in ldconfig to reduce the ld.so.cache size slightly. --- elf/Makefile | 2 +- elf/stringtable.c | 201 +++++++++++++++++++++++++++++++++++++++++ elf/stringtable.h | 61 +++++++++++++ elf/stringtable_free.c | 32 +++++++ elf/tst-stringtable.c | 140 ++++++++++++++++++++++++++++ 5 files changed, 435 insertions(+), 1 deletion(-) create mode 100644 elf/stringtable.c create mode 100644 elf/stringtable.h create mode 100644 elf/stringtable_free.c create mode 100644 elf/tst-stringtable.c diff --git a/elf/Makefile b/elf/Makefile index 728cb3b734..06e57300c2 100644 --- a/elf/Makefile +++ b/elf/Makefile @@ -170,7 +170,7 @@ tests-container := \ tests := tst-tls9 tst-leaks1 \ tst-array1 tst-array2 tst-array3 tst-array4 tst-array5 \ - tst-auxv + tst-auxv tst-stringtable tests-internal := tst-tls1 tst-tls2 $(tests-static-internal) tests-static := $(tests-static-normal) $(tests-static-internal) diff --git a/elf/stringtable.c b/elf/stringtable.c new file mode 100644 index 0000000000..f9ade50249 --- /dev/null +++ b/elf/stringtable.c @@ -0,0 +1,201 @@ +/* String tables for ld.so.cache construction. Implementation. + This file is part of the GNU C Library. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation; version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see . */ + +#include +#include +#include +#include +#include +#include +#include + +static void +stringtable_init (struct stringtable *table) +{ + table->count = 0; + table->allocated = 16; + table->entries = xcalloc (table->allocated, sizeof (table->entries[0])); +} + +/* 32-bit FNV-1a hash function. */ +static uint32_t +fnv1a (const char *string, size_t length) +{ + const unsigned char *p = (const unsigned char *) string; + uint32_t hash = 2166136261U; + for (size_t i = 0; i < length; ++i) + { + hash ^= p[i]; + hash *= 16777619U; + } + return hash; +} + +/* Double the capacity of the hash table. */ +static void +stringtable_rehash (struct stringtable *table) +{ + /* Cannot overflow because the old allocation size (in bytes) is + larger. */ + uint32_t new_allocated = table->allocated * 2; + struct stringtable_entry **new_entries + = xcalloc (new_allocated, sizeof (table->entries[0])); + + uint32_t mask = new_allocated - 1; + for (uint32_t i = 0; i < table->allocated; ++i) + for (struct stringtable_entry *e = table->entries[i]; e != NULL; ) + { + struct stringtable_entry *next = e->next; + uint32_t hash = fnv1a (e->string, e->length); + uint32_t new_index = hash & mask; + e->next = new_entries[new_index]; + new_entries[new_index] = e; + e = next; + } + + free (table->entries); + table->entries = new_entries; + table->allocated = new_allocated; +} + +struct stringtable_entry * +stringtable_intern (struct stringtable *table, const char *string) +{ + if (table->allocated == 0) + stringtable_init (table); + + size_t length = strlen (string); + if (length > (1U << 30)) + error (EXIT_FAILURE, 0, _("String table string is too long")); + uint32_t hash = fnv1a (string, length); + + /* Return a previously-existing entry. */ + for (struct stringtable_entry *e + = table->entries[hash & (table->allocated - 1)]; + e != NULL; e = e->next) + if (e->length == length && memcmp (e->string, string, length) == 0) + return e; + + /* Increase the size of the table if necessary. Keep utilization + below two thirds. */ + if (table->count >= (1U << 30)) + error (EXIT_FAILURE, 0, _("String table has too many entries")); + if (table->count * 3 > table->allocated * 2) + stringtable_rehash (table); + + /* Add the new table entry. */ + ++table->count; + struct stringtable_entry *e + = xmalloc (offsetof (struct stringtable_entry, string) + length + 1); + uint32_t index = hash & (table->allocated - 1); + e->next = table->entries[index]; + table->entries[index] = e; + e->length = length; + e->offset = 0; + memcpy (e->string, string, length + 1); + return e; +} + +/* Sort reversed strings in lexicographic order. This is used for tail + merging. */ +static int +finalize_compare (const void *l, const void *r) +{ + struct stringtable_entry *left = *(struct stringtable_entry **) l; + struct stringtable_entry *right = *(struct stringtable_entry **) r; + size_t to_compare; + if (left->length < right->length) + to_compare = left->length; + else + to_compare = right->length; + for (ssize_t i = to_compare - 1; i >= 0; --i) + { + unsigned char lch = left->string[i]; + unsigned char rch = right->string[i]; + if (lch != rch) + return lch - rch; + } + if (left->length == right->length) + return 0; + else if (left->length < right->length) + /* Longer strings should come first. */ + return 1; + else + return -1; +} + +void +stringtable_finalize (struct stringtable *table, + struct stringtable_finalized *result) +{ + if (table->count == 0) + { + result->strings = xstrdup (""); + result->size = 0; + return; + } + + /* Optimize the order of the strings. */ + struct stringtable_entry **array = xcalloc (table->count, sizeof (*array)); + { + size_t j = 0; + for (uint32_t i = 0; i < table->allocated; ++i) + for (struct stringtable_entry *e = table->entries[i]; e != NULL; + e = e->next) + { + array[j] = e; + ++j; + } + assert (j == table->count); + } + qsort (array, table->count, sizeof (*array), finalize_compare); + + /* Assign offsets, using table sharing if possible. */ + array[0]->offset = 0; + for (uint32_t j = 1; j < table->count; ++j) + { + struct stringtable_entry *previous = array[j - 1]; + struct stringtable_entry *current = array[j]; + if (previous->length >= current->length + && memcmp (&previous->string[previous->length - current->length], + current->string, current->length) == 0) + current->offset = (previous->offset + previous->length + - current->length); + else if (__builtin_add_overflow (previous->offset, + previous->length + 1, + ¤t->offset)) + error (EXIT_FAILURE, 0, _("String table is too large")); + } + + /* Allocate the result string. */ + { + struct stringtable_entry *last = array[table->count - 1]; + if (__builtin_add_overflow (last->offset, last->length + 1, + &result->size)) + error (EXIT_FAILURE, 0, _("String table is too large")); + } + /* The strings are copied from the hash table, so the array is no + longer needed. */ + free (array); + result->strings = xcalloc (result->size, 1); + + /* Copy the strings. */ + for (uint32_t i = 0; i < table->allocated; ++i) + for (struct stringtable_entry *e = table->entries[i]; e != NULL; + e = e->next) + if (result->strings[e->offset] == '\0') + memcpy (&result->strings[e->offset], e->string, e->length + 1); +} diff --git a/elf/stringtable.h b/elf/stringtable.h new file mode 100644 index 0000000000..e35b6c67fd --- /dev/null +++ b/elf/stringtable.h @@ -0,0 +1,61 @@ +/* String tables for ld.so.cache construction. + This file is part of the GNU C Library. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation; version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see . */ + +#ifndef _STRINGTABLE_H +#define _STRINGTABLE_H + +#include +#include + +/* An entry in the string table. Only the length and string fields are + expected to be used outside the string table code. */ +struct stringtable_entry +{ + struct stringtable_entry *next; /* For collision resolution. */ + uint32_t length; /* Length of then string. */ + uint32_t offset; /* From start of finalized table. */ + char string[]; /* Null-terminated string. */ +}; + +/* A string table. Zero-initialization produces a valid atable. */ +struct stringtable +{ + struct stringtable_entry **entries; + uint32_t count; /* Number of elements in the table. */ + uint32_t allocated; /* Length of the entries array. */ +}; + +/* Adds STRING to TABLE. May return the address of an existing entry. */ +struct stringtable_entry *stringtable_intern (struct stringtable *table, + const char *string); + +/* Result of stringtable_finalize. SIZE bytes at STRINGS should be + written to the file. */ +struct stringtable_finalized +{ + char *strings; + size_t size; +}; + +/* Assigns offsets to string table entries and computes the serialized + form of the string table. */ +void stringtable_finalize (struct stringtable *table, + struct stringtable_finalized *result); + +/* Deallocate the string table (but not the TABLE pointer itself). */ +void stringtable_free (struct stringtable *table); + +#endif /* _STRINGTABLE_H */ diff --git a/elf/stringtable_free.c b/elf/stringtable_free.c new file mode 100644 index 0000000000..0e5296e429 --- /dev/null +++ b/elf/stringtable_free.c @@ -0,0 +1,32 @@ +/* String tables for ld.so.cache construction. Deallocation (for tests only). + This file is part of the GNU C Library. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation; version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see . */ + +#include +#include + +void +stringtable_free (struct stringtable *table) +{ + for (uint32_t i = 0; i < table->allocated; ++i) + for (struct stringtable_entry *e = table->entries[i]; e != NULL; ) + { + struct stringtable_entry *next = e->next; + free (e); + e = next; + } + free (table->entries); + *table = (struct stringtable) { 0, }; +} diff --git a/elf/tst-stringtable.c b/elf/tst-stringtable.c new file mode 100644 index 0000000000..78ca5434df --- /dev/null +++ b/elf/tst-stringtable.c @@ -0,0 +1,140 @@ +/* Unit test for ldconfig string tables. + This file is part of the GNU C Library. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published + by the Free Software Foundation; version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see . */ + +#include +#include +#include +#include +#include + +static int +do_test (void) +{ + /* Empty string table. */ + { + struct stringtable s = { 0, }; + struct stringtable_finalized f; + stringtable_finalize (&s, &f); + TEST_COMPARE_STRING (f.strings, ""); + TEST_COMPARE (f.size, 0); + free (f.strings); + stringtable_free (&s); + } + + /* String table with one empty string. */ + { + struct stringtable s = { 0, }; + struct stringtable_entry *e = stringtable_intern (&s, ""); + TEST_COMPARE_STRING (e->string, ""); + TEST_COMPARE (e->length, 0); + TEST_COMPARE (s.count, 1); + + struct stringtable_finalized f; + stringtable_finalize (&s, &f); + TEST_COMPARE (e->offset, 0); + TEST_COMPARE_STRING (f.strings, ""); + TEST_COMPARE (f.size, 1); + free (f.strings); + stringtable_free (&s); + } + + /* String table with one non-empty string. */ + { + struct stringtable s = { 0, }; + struct stringtable_entry *e = stringtable_intern (&s, "name"); + TEST_COMPARE_STRING (e->string, "name"); + TEST_COMPARE (e->length, 4); + TEST_COMPARE (s.count, 1); + + struct stringtable_finalized f; + stringtable_finalize (&s, &f); + TEST_COMPARE (e->offset, 0); + TEST_COMPARE_STRING (f.strings, "name"); + TEST_COMPARE (f.size, 5); + free (f.strings); + stringtable_free (&s); + } + + /* Two strings, one is a prefix of the other. Tail-merging can only + happen in one way in this case. */ + { + struct stringtable s = { 0, }; + struct stringtable_entry *suffix = stringtable_intern (&s, "suffix"); + TEST_COMPARE_STRING (suffix->string, "suffix"); + TEST_COMPARE (suffix->length, 6); + TEST_COMPARE (s.count, 1); + + struct stringtable_entry *prefix + = stringtable_intern (&s, "prefix-suffix"); + TEST_COMPARE_STRING (prefix->string, "prefix-suffix"); + TEST_COMPARE (prefix->length, strlen ("prefix-suffix")); + TEST_COMPARE (s.count, 2); + + struct stringtable_finalized f; + stringtable_finalize (&s, &f); + TEST_COMPARE (prefix->offset, 0); + TEST_COMPARE (suffix->offset, strlen ("prefix-")); + TEST_COMPARE_STRING (f.strings, "prefix-suffix"); + TEST_COMPARE (f.size, sizeof ("prefix-suffix")); + free (f.strings); + stringtable_free (&s); + } + + /* String table with various shared prefixes. Triggers hash + resizing. */ + { + enum { count = 1500 }; + char *strings[2 * count]; + struct stringtable_entry *entries[2 * count]; + struct stringtable s = { 0, }; + for (int i = 0; i < count; ++i) + { + strings[i] = xasprintf ("%d", i); + entries[i] = stringtable_intern (&s, strings[i]); + TEST_COMPARE (entries[i]->length, strlen (strings[i])); + TEST_COMPARE_STRING (entries[i]->string, strings[i]); + strings[i + count] = xasprintf ("prefix/%d", i); + entries[i + count] = stringtable_intern (&s, strings[i + count]); + TEST_COMPARE (entries[i + count]->length, strlen (strings[i + count])); + TEST_COMPARE_STRING (entries[i + count]->string, strings[i + count]); + } + + struct stringtable_finalized f; + stringtable_finalize (&s, &f); + + for (int i = 0; i < 2 * count; ++i) + { + TEST_COMPARE (entries[i]->length, strlen (strings[i])); + TEST_COMPARE_STRING (entries[i]->string, strings[i]); + TEST_COMPARE_STRING (f.strings + entries[i]->offset, strings[i]); + free (strings[i]); + } + + free (f.strings); + stringtable_free (&s); + } + + return 0; +} + +#include + +/* Re-compile the string table implementation here. It is not + possible to link against the actual build because it was built for + use in ldconfig. */ +#define _(arg) arg +#include "stringtable.c" +#include "stringtable_free.c"