From patchwork Mon Jan 11 16:32:08 2016
X-Patchwork-Submitter: Szabolcs Nagy
X-Patchwork-Id: 10331
Message-ID: <5693D908.8090104@arm.com>
Date: Mon, 11 Jan 2016 16:32:08 +0000
From: Szabolcs Nagy
To: GNU C Library
CC: , "triegel@redhat.com" ,
Subject: Re: [PATCH][BZ #19329] Fix race between tls allocation at thread creation and dlopen
References: <568D5E11.3010301@arm.com>
In-Reply-To: <568D5E11.3010301@arm.com>
Changes in patch v2:

* The listp->next race is fixed.
* GL(dl_tls_generation) is loaded before GL(dl_tls_max_dtv_idx).
  (This matters if several dlopens happen during _dl_allocate_tls_init
  between those two loads.)
* More detailed analysis below.

I can trigger the bug on x86_64 by loading a library with >50
dependencies with tls and concurrently creating >1000 threads.  This
was not turned into a glibc test case, but a reproducer is attached to
the bug report.  (On aarch64 the failure happens to be easier to
trigger.)

The following failures can be triggered:

Inconsistency detected by ld.so: dl-tls.c: 493: _dl_allocate_tls_init:
Assertion `listp->slotinfo[cnt].gen <= _rtld_local._dl_tls_generation' failed!

and

Inconsistency detected by ld.so: dl-tls.c: 525: _dl_allocate_tls_init:
Assertion `listp != NULL' failed!

dlopen modifies tls related dynamic linker data structures while
holding the GL(dl_load_lock) if the loaded library has tls.  Meanwhile,
at thread creation the same globals are accessed when the dtv and tls
are set up for the new thread in _dl_allocate_tls_init, without holding
any locks.  At least the following global objects may have conflicting
accesses:

  GL(dl_tls_max_dtv_idx)
  GL(dl_tls_generation)
  listp->slotinfo[i].map
  listp->slotinfo[i].gen
  listp->next

where listp points to a node in GL(dl_tls_dtv_slotinfo_list).

The race window for the first assert failure above is short compared to
dlopen and thread creation, so it rarely happens, but the probability
can be increased if the loaded library has a lot of dependencies with
tls; and if the dependencies don't fit into the same listp->slotinfo
array, the second assert failure starts to happen as well.

The current sequence of events is:

dlopen:
  The modids of the loaded library and its dependencies are set one by
  one to ++GL(dl_tls_max_dtv_idx) (assuming they have tls).  Then these
  modules are added to GL(dl_tls_dtv_slotinfo_list) one by one with the
  same generation number: GL(dl_tls_generation)+1.  Then
  GL(dl_tls_generation) is increased.

pthread_create:
  The dtv for the thread is allocated assuming GL(dl_tls_max_dtv_idx)
  is the max modid that will be added to the dtv here.  Then
  GL(dl_tls_dtv_slotinfo_list) is walked to initialize dtv[modid] for
  all modules in the list and to initialize the related tls.  At the
  end dtv[0].counter is set to the max generation number seen; all
  modules with a less-than-or-equal generation number must have their
  dtv and tls initialized in this thread.

The patch uses atomics to add modules to the slotinfo list and to
increase GL(dl_tls_generation) during dlopen, and similarly uses
atomics to safely walk the slotinfo list in pthread_create.  The dtv
and tls are initialized for all modules up to the observed
GL(dl_tls_generation); modules with a larger generation number
(concurrently loaded modules) are ignored.
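To make the intended ordering easier to see outside of the diff below,
here is a minimal standalone C11-atomics model of the writer (dlopen)
and reader (thread creation) sides described above.  All names (struct
slot, publish_module, init_thread_dtv, NSLOTS) are illustrative only,
this is not glibc code (glibc uses its internal atomic_* macros rather
than <stdatomic.h>), and the dtv/tls setup itself is left out.

/* Sketch of the release/acquire pairing the patch relies on.  */
#include <stdatomic.h>
#include <stddef.h>

#define NSLOTS 64

struct slot
{
  void *_Atomic map;   /* NULL means unused, like slotinfo[i].map.  */
  size_t gen;          /* Written before map is published.  */
};

static struct slot slots[NSLOTS];
static _Atomic size_t generation;

/* Writer side (dlopen, serialized by GL(dl_load_lock)): fill in the
   entry, publish it with a release store of map, and finally bump the
   generation with a release store, so a reader that observes the new
   map or generation also observes the data written before it.  */
void
publish_module (size_t idx, void *map)
{
  size_t newgen
    = atomic_load_explicit (&generation, memory_order_relaxed) + 1;
  slots[idx].gen = newgen;
  atomic_store_explicit (&slots[idx].map, map, memory_order_release);
  atomic_store_explicit (&generation, newgen, memory_order_release);
}

/* Reader side (thread creation, lockless): observe the generation
   first, then walk the slots with acquire loads and skip entries that
   are newer than the observed generation; those are set up lazily.  */
size_t
init_thread_dtv (void)
{
  size_t gen_count
    = atomic_load_explicit (&generation, memory_order_acquire);
  size_t maxgen = 0;
  for (size_t i = 0; i < NSLOTS; i++)
    {
      void *map
	= atomic_load_explicit (&slots[i].map, memory_order_acquire);
      if (map == NULL || slots[i].gen > gen_count)
	continue;            /* Unused or concurrently loaded entry.  */
      if (slots[i].gen > maxgen)
	maxgen = slots[i].gen;
      /* ... set up the dtv entry and tls for MAP here ...  */
    }
  return maxgen;             /* Becomes dtv[0].counter.  */
}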
Further issues:

dlclose also modifies the slotinfo list in unsafe ways and I don't
immediately see a lock-free way to synchronize that with thread
creation.  (Taking GL(dl_load_lock) at thread creation does not work
because it can deadlock if pthread_create is called from a ctor while
dlopen holds the lock.)  I.e. this patch assumes dlclose is not called
concurrently with pthread_create.

Two (incorrect) assertions were removed from the code, and I don't see
an easy way to add similar runtime checks that would guard against such
races or memory corruption.

I did not review all accesses to the problematic globals, so there are
most likely other problems (e.g. GL(dl_tls_max_dtv_idx) is modified
without atomics, tls access (__tls_get_addr) does not use atomics to
access the same globals, etc.).

Glibc does not do worst-case allocation of tls for existing threads
whenever a library is loaded, so allocation might be needed at the
first access to a tls object, which is not AS-safe (async-signal-safe)
and can crash on oom.  The dtv and tls of the modules that are ignored
in this patch will be initialized lazily in the same way.
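Not part of the patch: for readers who want the shape of the trigger, a
rough sketch of the scenario described above follows.  The actual
reproducer is attached to the bug report; "libmanytls.so" and NTHREADS
are placeholders (the library needs many tls-using dependencies, e.g.
>50, and typically several runs are needed to hit the window).  Build
with something like: gcc -O2 repro.c -ldl -lpthread

/* Create many threads while a library with many tls-using dependencies
   is being dlopened; unpatched, new threads can hit the asserts in
   _dl_allocate_tls_init.  */
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NTHREADS 1024

static void *
thread_fn (void *arg)
{
  /* The failure happens in _dl_allocate_tls_init before this runs.  */
  return arg;
}

static void *
dlopen_fn (void *arg)
{
  /* Concurrently load a library with many tls-using dependencies.  */
  if (dlopen ("libmanytls.so", RTLD_NOW) == NULL)
    fprintf (stderr, "dlopen: %s\n", dlerror ());
  return arg;
}

int
main (void)
{
  pthread_t opener, t[NTHREADS];

  pthread_create (&opener, NULL, dlopen_fn, NULL);

  /* Each new thread initializes its dtv and tls without locks while the
     dlopen above may still be updating the slotinfo list.  */
  for (int i = 0; i < NTHREADS; i++)
    if (pthread_create (&t[i], NULL, thread_fn, NULL) != 0)
      abort ();

  for (int i = 0; i < NTHREADS; i++)
    pthread_join (t[i], NULL);
  pthread_join (opener, NULL);
  return 0;
}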
Changelog:

2016-01-11  Szabolcs Nagy

	[BZ #19329]
	* elf/dl-open.c (dl_open_worker): Write GL(dl_tls_generation)
	atomically.
	* elf/dl-tls.c (_dl_allocate_tls_init): Read GL(dl_tls_generation),
	GL(dl_tls_max_dtv_idx), slotinfo entries and listp->next atomically.
	(_dl_add_to_slotinfo): Write the slotinfo entries and listp->next
	atomically.

diff --git a/elf/dl-open.c b/elf/dl-open.c
index 6f178b3..7315c3a 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -524,9 +524,15 @@ dl_open_worker (void *a)
     }
 
   /* Bump the generation number if necessary.  */
-  if (any_tls && __builtin_expect (++GL(dl_tls_generation) == 0, 0))
-    _dl_fatal_printf (N_("\
+  if (any_tls)
+    {
+      size_t newgen = GL(dl_tls_generation) + 1;
+      if (__builtin_expect (newgen == 0, 0))
+	_dl_fatal_printf (N_("\
 TLS generation counter wrapped!  Please report this."));
+      /* Synchronize with the load acquire in _dl_allocate_tls_init.  */
+      atomic_store_release (&GL(dl_tls_generation), newgen);
+    }
 
   /* We need a second pass for static tls data, because _dl_update_slotinfo
      must not be run while calls to _dl_add_to_slotinfo are still pending.  */
diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index ed13fd9..0afbcab 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -455,9 +455,15 @@ _dl_allocate_tls_init (void *result)
   struct dtv_slotinfo_list *listp;
   size_t total = 0;
   size_t maxgen = 0;
+  size_t gen_count;
+  size_t dtv_slots;
 
+  /* Synchronize with dl_open_worker, concurrently loaded modules with
+     larger generation number should be ignored here.  */
+  gen_count = atomic_load_acquire (&GL(dl_tls_generation));
   /* Check if the current dtv is big enough.  */
-  if (dtv[-1].counter < GL(dl_tls_max_dtv_idx))
+  dtv_slots = atomic_load_relaxed (&GL(dl_tls_max_dtv_idx));
+  if (dtv[-1].counter < dtv_slots)
     {
       /* Resize the dtv.  */
       dtv = _dl_resize_dtv (dtv);
@@ -480,18 +486,23 @@ _dl_allocate_tls_init (void *result)
 	  void *dest;
 
 	  /* Check for the total number of used slots.  */
-	  if (total + cnt > GL(dl_tls_max_dtv_idx))
+	  if (total + cnt > dtv_slots)
 	    break;
 
-	  map = listp->slotinfo[cnt].map;
+	  /* Synchronize with _dl_add_to_slotinfo.  */
+	  map = atomic_load_acquire (&listp->slotinfo[cnt].map);
 	  if (map == NULL)
 	    /* Unused entry.  */
 	    continue;
 
+	  size_t gen = listp->slotinfo[cnt].gen;
+	  if (gen > gen_count)
+	    /* New, concurrently loaded entry.  */
+	    continue;
+
 	  /* Keep track of the maximum generation number.  This might
 	     not be the generation counter.  */
-	  assert (listp->slotinfo[cnt].gen <= GL(dl_tls_generation));
-	  maxgen = MAX (maxgen, listp->slotinfo[cnt].gen);
+	  maxgen = MAX (maxgen, gen);
 
 	  dtv[map->l_tls_modid].pointer.val = TLS_DTV_UNALLOCATED;
 	  dtv[map->l_tls_modid].pointer.is_static = false;
@@ -518,11 +529,13 @@ _dl_allocate_tls_init (void *result)
 	}
 
       total += cnt;
-      if (total >= GL(dl_tls_max_dtv_idx))
+      if (total >= dtv_slots)
 	break;
 
-      listp = listp->next;
-      assert (listp != NULL);
+      /* Synchronize with _dl_add_to_slotinfo.  */
+      listp = atomic_load_acquire (&listp->next);
+      if (listp == NULL)
+	break;
     }
 
   /* The DTV version is up-to-date now.  */
@@ -916,7 +929,7 @@ _dl_add_to_slotinfo (struct link_map *l)
 	 the first slot.  */
       assert (idx == 0);
 
-      listp = prevp->next = (struct dtv_slotinfo_list *)
+      listp = (struct dtv_slotinfo_list *)
 	malloc (sizeof (struct dtv_slotinfo_list)
 		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
       if (listp == NULL)
@@ -939,9 +952,13 @@ cannot create TLS data structures"));
       listp->next = NULL;
       memset (listp->slotinfo, '\0',
 	      TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
+      /* _dl_allocate_tls_init concurrently walks this list at thread
+	 creation, it must only observe initialized nodes in the list.  */
+      atomic_store_release (&prevp->next, listp);
     }
 
   /* Add the information into the slotinfo data structure.  */
-  listp->slotinfo[idx].map = l;
   listp->slotinfo[idx].gen = GL(dl_tls_generation) + 1;
+  /* Synchronize with the acquire load in _dl_allocate_tls_init.  */
+  atomic_store_release (&listp->slotinfo[idx].map, l);
 }