From patchwork Tue Oct 10 13:34:26 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 23425 Received: (qmail 100360 invoked by alias); 10 Oct 2017 13:34:36 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 99768 invoked by uid 89); 10 Oct 2017 13:34:35 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=cancellation X-HELO: mx1.redhat.com DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com AB3FAC04AC6B Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=fweimer@redhat.com Subject: Re: [PATCH] nss_files: Avoid large buffers with many host addresses [BZ #22078] To: Adhemerval Zanella , libc-alpha@sourceware.org References: <20170904173210.3CE87439942E3@oldenburg.str.redhat.com> <2ed7620d-87be-74cd-f346-70f08e31a2d0@linaro.org> From: Florian Weimer Message-ID: <6430d3c4-578c-c6bd-3074-b511223747bf@redhat.com> Date: Tue, 10 Oct 2017 15:34:26 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: <2ed7620d-87be-74cd-f346-70f08e31a2d0@linaro.org> On 09/05/2017 08:40 PM, Adhemerval Zanella wrote: >> +/* Type of the address and alias arrays. */ >> +#define DYNARRAY_STRUCT array >> +#define DYNARRAY_ELEMENT char * >> +#define DYNARRAY_PREFIX array_ >> +#include >> + > This will create a 16 elements vector as default (128 bytes). Should we > aim to use the default or can we use a slight large buffer to speed up > slightly generic case (assuming generic case use more than 16 entries)? 16 elements is actually too large, I wouldn't expect more than 1 or 2 elements in the typical case. But we already have a scratch buffer on the stack, so I don't think it makes sense to override the default size to reduce stack usage. The rebased patch simplifies the code a bit, by eliminating the new_data variable. We now always copy back the addresses and aliases from the dynamic arrays. In addition, this allows us to reuse the space internal_getent allocated to the aliases array. Thanks, Florian The previous implementation had at least a quadratic space requirement in the number of host addresses and aliases. 2017-10-10 Florian Weimer [BZ #22078] Avoid large NSS buffers with many addresses, aliases. * nss/nss_files/files-hosts.c (gethostbyname3_multi): Rewrite using dynarrays and struct alloc_buffer. * nss/Makefile (tests): Add tst-nss-files-hosts-multi. (tst-nss-files-hosts-multi): Link with -ldl. * nss/tst-nss-files-hosts-multi.c: New file. diff --git a/nss/Makefile b/nss/Makefile index f27bed11fc..26952112c1 100644 --- a/nss/Makefile +++ b/nss/Makefile @@ -63,6 +63,7 @@ xtests = bug-erange # Tests which need libdl ifeq (yes,$(build-shared)) tests += tst-nss-files-hosts-erange +tests += tst-nss-files-hosts-multi endif # If we have a thread library then we can test cancellation against @@ -167,3 +168,4 @@ $(objpfx)tst-cancel-getpwuid_r: $(shared-thread-library) endif $(objpfx)tst-nss-files-hosts-erange: $(libdl) +$(objpfx)tst-nss-files-hosts-multi: $(libdl) diff --git a/nss/nss_files/files-hosts.c b/nss/nss_files/files-hosts.c index 763fa39a47..6f7cc4d94b 100644 --- a/nss/nss_files/files-hosts.c +++ b/nss/nss_files/files-hosts.c @@ -23,6 +23,7 @@ #include #include #include +#include /* Get implementation for some internal functions. */ @@ -116,24 +117,45 @@ DB_LOOKUP (hostbyaddr, ,,, }, const void *addr, socklen_t len, int af) #undef EXTRA_ARGS_VALUE +/* Type of the address and alias arrays. */ +#define DYNARRAY_STRUCT array +#define DYNARRAY_ELEMENT char * +#define DYNARRAY_PREFIX array_ +#include + static enum nss_status gethostbyname3_multi (FILE * stream, const char *name, int af, struct hostent *result, char *buffer, size_t buflen, int *errnop, int *herrnop, int flags) { + assert (af == AF_INET || af == AF_INET6); + /* We have to get all host entries from the file. */ struct scratch_buffer tmp_buffer; scratch_buffer_init (&tmp_buffer); struct hostent tmp_result_buf; - int naddrs = 1; - int naliases = 0; - char *bufferend; + struct array addresses; + array_init (&addresses); + struct array aliases; + array_init (&aliases); enum nss_status status; - while (result->h_aliases[naliases] != NULL) - ++naliases; + /* Preserve the addresses and aliases encountered so far. */ + for (size_t i = 0; result->h_addr_list[i] != NULL; ++i) + array_add (&addresses, result->h_addr_list[i]); + for (size_t i = 0; result->h_aliases[i] != NULL; ++i) + array_add (&aliases, result->h_aliases[i]); - bufferend = (char *) &result->h_aliases[naliases + 1]; + /* The output buffer re-uses now-unused space at the end of the + buffer, starting with the aliases array. It comes last in the + data produced by internal_getent. (The alias names themselves + are still located in the line read in internal_getent, which is + stored at the beginning of the buffer.) */ + struct alloc_buffer outbuf; + { + char *bufferend = (char *) result->h_aliases; + outbuf = alloc_buffer_create (bufferend, buffer + buflen - bufferend); + } while (true) { @@ -170,46 +192,74 @@ gethostbyname3_multi (FILE * stream, const char *name, int af, } while ((matches = 0)); + /* If the line matches, we need to copy the addresses and + aliases, so that we can reuse tmp_buffer for the next + line. */ if (matches) { - /* We could be very clever and try to recycle a few bytes - in the buffer instead of generating new arrays. But - we are not doing this here since it's more work than - it's worth. Simply let the user provide a bit bigger - buffer. */ - char **new_h_addr_list; - char **new_h_aliases; - int newaliases = 0; - size_t newstrlen = 0; - int cnt; + /* Record the addresses. */ + for (size_t i = 0; tmp_result_buf.h_addr_list[i] != NULL; ++i) + { + /* Allocate the target space in the output buffer, + depending on the address family. */ + void *target; + if (af == AF_INET) + { + assert (tmp_result_buf.h_length == 4); + target = alloc_buffer_alloc (&outbuf, struct in_addr); + } + else if (af == AF_INET6) + { + assert (tmp_result_buf.h_length == 16); + target = alloc_buffer_alloc (&outbuf, struct in6_addr); + } + else + __builtin_unreachable (); + + if (target == NULL) + { + /* Request a larger output buffer. */ + *errnop = ERANGE; + *herrnop = NETDB_INTERNAL; + status = NSS_STATUS_TRYAGAIN; + break; + } + memcpy (target, tmp_result_buf.h_addr_list[i], + tmp_result_buf.h_length); + array_add (&addresses, target); + } - /* Count the new aliases and the length of the strings. */ - while (tmp_result_buf.h_aliases[newaliases] != NULL) + /* Record the aliases. */ + for (size_t i = 0; tmp_result_buf.h_aliases[i] != NULL; ++i) { - char *cp = tmp_result_buf.h_aliases[newaliases]; - ++newaliases; - newstrlen += strlen (cp) + 1; + char *alias = tmp_result_buf.h_aliases[i]; + array_add (&aliases, + alloc_buffer_copy_string (&outbuf, alias)); } - /* If the real name is different add it also to the - aliases. This means that there is a duplication - in the alias list but this is really the user's + + /* If the real name is different add, it also to the + aliases. This means that there is a duplication in + the alias list but this is really the user's problem. */ - if (strcmp (old_result->h_name, - tmp_result_buf.h_name) != 0) + { + char *new_name = tmp_result_buf.h_name; + if (strcmp (old_result->h_name, new_name) != 0) + array_add (&aliases, + alloc_buffer_copy_string (&outbuf, new_name)); + } + + /* Report memory allocation failures during the + expansion of the temporary arrays. */ + if (array_has_failed (&addresses) || array_has_failed (&aliases)) { - ++newaliases; - newstrlen += strlen (tmp_result_buf.h_name) + 1; + *errnop = ENOMEM; + *herrnop = NETDB_INTERNAL; + status = NSS_STATUS_UNAVAIL; + break; } - /* Make sure bufferend is aligned. */ - assert ((bufferend - (char *) 0) % sizeof (char *) == 0); - - /* Now we can check whether the buffer is large enough. - 16 is the maximal size of the IP address. */ - if (bufferend + 16 + (naddrs + 2) * sizeof (char *) - + roundup (newstrlen, sizeof (char *)) - + (naliases + newaliases + 1) * sizeof (char *) - >= buffer + buflen) + /* Request a larger output buffer if we ran out of room. */ + if (alloc_buffer_has_failed (&outbuf)) { *errnop = ERANGE; *herrnop = NETDB_INTERNAL; @@ -217,63 +267,6 @@ gethostbyname3_multi (FILE * stream, const char *name, int af, break; } - new_h_addr_list = - (char **) (bufferend - + roundup (newstrlen, sizeof (char *)) - + 16); - new_h_aliases = - (char **) ((char *) new_h_addr_list - + (naddrs + 2) * sizeof (char *)); - - /* Copy the old data in the new arrays. */ - for (cnt = 0; cnt < naddrs; ++cnt) - new_h_addr_list[cnt] = old_result->h_addr_list[cnt]; - - for (cnt = 0; cnt < naliases; ++cnt) - new_h_aliases[cnt] = old_result->h_aliases[cnt]; - - /* Store the new strings. */ - cnt = 0; - while (tmp_result_buf.h_aliases[cnt] != NULL) - { - new_h_aliases[naliases++] = bufferend; - bufferend = (__stpcpy (bufferend, - tmp_result_buf.h_aliases[cnt]) - + 1); - ++cnt; - } - - if (cnt < newaliases) - { - new_h_aliases[naliases++] = bufferend; - bufferend = __stpcpy (bufferend, - tmp_result_buf.h_name) + 1; - } - - /* Final NULL pointer. */ - new_h_aliases[naliases] = NULL; - - /* Round up the buffer end address. */ - bufferend += (sizeof (char *) - - ((bufferend - (char *) 0) - % sizeof (char *))) % sizeof (char *); - - /* Now the new address. */ - new_h_addr_list[naddrs++] = - memcpy (bufferend, tmp_result_buf.h_addr, - tmp_result_buf.h_length); - - /* Also here a final NULL pointer. */ - new_h_addr_list[naddrs] = NULL; - - /* Store the new array pointers. */ - old_result->h_aliases = new_h_aliases; - old_result->h_addr_list = new_h_addr_list; - - /* Compute the new buffer end. */ - bufferend = (char *) &new_h_aliases[naliases + 1]; - assert (bufferend <= buffer + buflen); - result = old_result; } /* If match was found. */ @@ -293,7 +286,47 @@ gethostbyname3_multi (FILE * stream, const char *name, int af, if (status != NSS_STATUS_TRYAGAIN) status = NSS_STATUS_SUCCESS; + if (status == NSS_STATUS_SUCCESS) + { + /* Copy the address and alias arrays into the output buffer and + add NULL terminators. The pointed-to elements were directly + written into the output buffer above and do not need to be + copied again. */ + size_t addresses_count = array_size (&addresses); + size_t aliases_count = array_size (&aliases); + char **out_addresses = alloc_buffer_alloc_array + (&outbuf, char *, addresses_count + 1); + char **out_aliases = alloc_buffer_alloc_array + (&outbuf, char *, aliases_count + 1); + if (out_addresses == NULL || out_aliases == NULL) + { + /* The output buffer is not large enough. */ + *errnop = ERANGE; + *herrnop = NETDB_INTERNAL; + status = NSS_STATUS_TRYAGAIN; + /* Fall through to function exit. */ + } + else + { + /* Everything is allocated in place. Make the copies and + adjust the array pointers. */ + memcpy (out_addresses, array_begin (&addresses), + addresses_count * sizeof (char *)); + out_addresses[addresses_count] = NULL; + memcpy (out_aliases, array_begin (&aliases), + aliases_count * sizeof (char *)); + out_aliases[aliases_count] = NULL; + + result->h_addr_list = out_addresses; + result->h_aliases = out_aliases; + + status = NSS_STATUS_SUCCESS; + } + } + scratch_buffer_free (&tmp_buffer); + array_free (&addresses); + array_free (&aliases); return status; } diff --git a/nss/tst-nss-files-hosts-multi.c b/nss/tst-nss-files-hosts-multi.c new file mode 100644 index 0000000000..195a19be4f --- /dev/null +++ b/nss/tst-nss-files-hosts-multi.c @@ -0,0 +1,331 @@ +/* Parse /etc/hosts in multi mode with many addresses/aliases. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct support_chroot *chroot_env; + +static void +prepare (int argc, char **argv) +{ + chroot_env = support_chroot_create + ((struct support_chroot_configuration) + { + .resolv_conf = "", + .hosts = "", /* See write_hosts below. */ + .host_conf = "multi on\n", + }); +} + +/* Create the /etc/hosts file from outside the chroot. */ +static void +write_hosts (int count) +{ + TEST_VERIFY (count > 0 && count <= 65535); + FILE *fp = xfopen (chroot_env->path_hosts, "w"); + fputs ("127.0.0.1 localhost localhost.localdomain\n" + "::1 localhost localhost.localdomain\n", + fp); + for (int i = 0; i < count; ++i) + { + fprintf (fp, "10.4.%d.%d www4.example.com\n", + (i / 256) & 0xff, i & 0xff); + fprintf (fp, "10.46.%d.%d www.example.com\n", + (i / 256) & 0xff, i & 0xff); + fprintf (fp, "192.0.2.1 alias.example.com v4-%d.example.com\n", i); + fprintf (fp, "2001:db8::6:%x www6.example.com\n", i); + fprintf (fp, "2001:db8::46:%x www.example.com\n", i); + fprintf (fp, "2001:db8::1 alias.example.com v6-%d.example.com\n", i); + } + xfclose (fp); +} + +/* Parameters of a single test. */ +struct test_params +{ + const char *name; /* Name to query. */ + const char *marker; /* Address marker for the name. */ + int count; /* Number of addresses/aliases. */ + int family; /* AF_INET, AF_INET_6 or AF_UNSPEC. */ + bool canonname; /* True if AI_CANONNAME should be enabled. */ +}; + +/* Expected result of gethostbyname/gethostbyname2. */ +static char * +expected_ghbn (const struct test_params *params) +{ + TEST_VERIFY (params->family == AF_INET || params->family == AF_INET6); + + struct xmemstream expected; + xopen_memstream (&expected); + if (strcmp (params->name, "alias.example.com") == 0) + { + fprintf (expected.out, "name: %s\n", params->name); + char af; + if (params->family == AF_INET) + af = '4'; + else + af = '6'; + for (int i = 0; i < params->count; ++i) + fprintf (expected.out, "alias: v%c-%d.example.com\n", af, i); + + for (int i = 0; i < params->count; ++i) + if (params->family == AF_INET) + fputs ("address: 192.0.2.1\n", expected.out); + else + fputs ("address: 2001:db8::1\n", expected.out); + } + else /* www/www4/www6 name. */ + { + bool do_ipv4 = params->family == AF_INET + && strncmp (params->name, "www6", 4) != 0; + bool do_ipv6 = params->family == AF_INET6 + && strncmp (params->name, "www4", 4) != 0; + if (do_ipv4 || do_ipv6) + { + fprintf (expected.out, "name: %s\n", params->name); + if (do_ipv4) + for (int i = 0; i < params->count; ++i) + fprintf (expected.out, "address: 10.%s.%d.%d\n", + params->marker, i / 256, i % 256); + if (do_ipv6) + for (int i = 0; i < params->count; ++i) + fprintf (expected.out, "address: 2001:db8::%s:%x\n", + params->marker, i); + } + else + fputs ("error: HOST_NOT_FOUND\n", expected.out); + } + xfclose_memstream (&expected); + return expected.buffer; +} + +/* Expected result of getaddrinfo. */ +static char * +expected_gai (const struct test_params *params) +{ + bool do_ipv4 = false; + bool do_ipv6 = false; + if (params->family == AF_UNSPEC) + do_ipv4 = do_ipv6 = true; + else if (params->family == AF_INET) + do_ipv4 = true; + else if (params->family == AF_INET6) + do_ipv6 = true; + + struct xmemstream expected; + xopen_memstream (&expected); + if (strcmp (params->name, "alias.example.com") == 0) + { + if (params->canonname) + fprintf (expected.out, + "flags: AI_CANONNAME\n" + "canonname: %s\n", + params->name); + + if (do_ipv4) + for (int i = 0; i < params->count; ++i) + fputs ("address: STREAM/TCP 192.0.2.1 80\n", expected.out); + if (do_ipv6) + for (int i = 0; i < params->count; ++i) + fputs ("address: STREAM/TCP 2001:db8::1 80\n", expected.out); + } + else /* www/www4/www6 name. */ + { + if (strncmp (params->name, "www4", 4) == 0) + do_ipv6 = false; + else if (strncmp (params->name, "www6", 4) == 0) + do_ipv4 = false; + /* Otherwise, we have www as the name, so we do both. */ + + if (do_ipv4 || do_ipv6) + { + if (params->canonname) + fprintf (expected.out, + "flags: AI_CANONNAME\n" + "canonname: %s\n", + params->name); + + if (do_ipv4) + for (int i = 0; i < params->count; ++i) + fprintf (expected.out, "address: STREAM/TCP 10.%s.%d.%d 80\n", + params->marker, i / 256, i % 256); + if (do_ipv6) + for (int i = 0; i < params->count; ++i) + fprintf (expected.out, + "address: STREAM/TCP 2001:db8::%s:%x 80\n", + params->marker, i); + } + else + fputs ("error: Name or service not known\n", expected.out); + } + xfclose_memstream (&expected); + return expected.buffer; +} + +static void +run_gbhn_gai (struct test_params *params) +{ + char *ctx = xasprintf ("name=%s marker=%s count=%d family=%d", + params->name, params->marker, params->count, + params->family); + if (test_verbose > 0) + printf ("info: %s\n", ctx); + + /* Check gethostbyname, gethostbyname2. */ + if (params->family == AF_INET) + { + char *expected = expected_ghbn (params); + check_hostent (ctx, gethostbyname (params->name), expected); + free (expected); + } + if (params->family != AF_UNSPEC) + { + char *expected = expected_ghbn (params); + check_hostent (ctx, gethostbyname2 (params->name, params->family), + expected); + free (expected); + } + + /* Check getaddrinfo. */ + for (int do_canonical = 0; do_canonical < 2; ++do_canonical) + { + params->canonname = do_canonical; + char *expected = expected_gai (params); + struct addrinfo hints = + { + .ai_family = params->family, + .ai_socktype = SOCK_STREAM, + .ai_protocol = IPPROTO_TCP, + }; + if (do_canonical) + hints.ai_flags |= AI_CANONNAME; + struct addrinfo *ai; + int ret = getaddrinfo (params->name, "80", &hints, &ai); + check_addrinfo (ctx, ai, ret, expected); + if (ret == 0) + freeaddrinfo (ai); + free (expected); + } + + free (ctx); +} + +/* Callback for the subprocess which runs the test in a chroot. */ +static void +subprocess (void *closure) +{ + struct test_params *params = closure; + + xchroot (chroot_env->path_chroot); + + static const int families[] = { AF_INET, AF_INET6, AF_UNSPEC, -1 }; + static const char *const names[] = + { + "www.example.com", "www4.example.com", "www6.example.com", + "alias.example.com", + NULL + }; + static const char *const names_marker[] = { "46", "4", "6", "" }; + + for (int family_idx = 0; families[family_idx] >= 0; ++family_idx) + { + params->family = families[family_idx]; + for (int names_idx = 0; names[names_idx] != NULL; ++names_idx) + { + params->name = names[names_idx]; + params->marker = names_marker[names_idx]; + run_gbhn_gai (params); + } + } +} + +/* Run the test for a specific number of addresses/aliases. */ +static void +run_test (int count) +{ + write_hosts (count); + + struct test_params params = + { + .count = count, + }; + + support_isolate_in_subprocess (subprocess, ¶ms); +} + +static int +do_test (void) +{ + support_become_root (); + if (!support_can_chroot ()) + return EXIT_UNSUPPORTED; + + /* This test should not use gigabytes of memory. */ + { + struct rlimit limit; + if (getrlimit (RLIMIT_AS, &limit) != 0) + { + printf ("getrlimit (RLIMIT_AS) failed: %m\n"); + return 1; + } + long target = 200 * 1024 * 1024; + if (limit.rlim_cur == RLIM_INFINITY || limit.rlim_cur > target) + { + limit.rlim_cur = target; + if (setrlimit (RLIMIT_AS, &limit) != 0) + { + printf ("setrlimit (RLIMIT_AS) failed: %m\n"); + return 1; + } + } + } + + __nss_configure_lookup ("hosts", "files"); + if (dlopen (LIBNSS_FILES_SO, RTLD_LAZY) == NULL) + FAIL_EXIT1 ("could not load " LIBNSS_DNS_SO ": %s", dlerror ()); + + /* Run the tests with a few different address/alias counts. */ + for (int count = 1; count <= 111; ++count) + run_test (count); + run_test (1111); + run_test (22222); + + support_chroot_free (chroot_env); + return 0; +} + +#define PREPARE prepare +#include