[v1] nptl: namespace-safe pthread keys implementation
Checks
| Context | Check | Description |
| redhat-pt-bot/TryBot-apply_patch | success | Patch applied to master at the time it was sent |
| redhat-pt-bot/TryBot-32bit | fail | Patch caused testsuite regressions |
| linaro-tcwg-bot/tcwg_glibc_build--master-arm | success | Build passed |
| linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 | success | Build passed |
| linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 | fail | Test failed |
| linaro-tcwg-bot/tcwg_glibc_check--master-arm | fail | Test failed |
Commit Message
This patch makes a couple of key changes to the pthread
thread-specific-data (tsd) key logic:
* Root data is in ld.so and not libc.so
* Malloc is not used for allocation
* No practical key limit
This primarily addresses the problems when dlmopen namespaces are used
and multiple copies of libc.so are loaded; previously there would be
two copies of the root data and thus keys could not be shared across
the namespace boundary.
The mechanism of the change is as follows:
(based on https://conf.gnu-tools-cauldron.org/opo25/talk/LQTU3G/)
The old method supported a fixed 1024 keys; the root data was a single
1024-length array and the thread data was [32][32] array with the
second level allocated with malloc. The new method uses a [32] array
for the first level like the old method, but the second level is
allocated with mmap, and the first two words hold the number of key
slots and a hint for finding free slots (the hint is unused in the
per-thread data, but remains to keep the code simpler and safer). A
key's 5 LSBs index the first level, and the remaining bits index the
second level. The first second level (i.e. [0][]) points to a small
fixed array in the per-thread area, like the old method, but this area
can now hold up to 62 keys (the sequence number is not used at this
time). The second second layer (i.e. [1][]) points to a single mmap'd
page and can hold 510 keys. Each additional second layer is double
the size of the previous, up to about half a trillion keys in the 31st
second layer.
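The slot counts quoted above can be checked with a small arithmetic sketch. It assumes a 4096-byte page and 8-byte pointers (typical for 64-bit targets) and that size_t and void * have the same width. The helper names are hypothetical; the patch itself uses the bucket2mlen and bucket2slots macros, and its key encoding macros are not shown in this excerpt, so the encoding below only follows the "5 LSBs select the bucket" description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names for illustration only.  */

#define BUCKET_BITS 5u                     /* 5 LSBs select one of 32 buckets.  */
#define BUCKET_MASK ((1u << BUCKET_BITS) - 1u)

/* Bytes mapped for bucket B (1..31): one page, doubling per bucket.  */
static uint64_t
bucket_bytes (uint64_t page_size, unsigned int b)
{
  assert (b >= 1 && b < 32);
  return page_size << (b - 1);
}

/* Usable slots in bucket B: the mapping minus the two header words
   (slot count and free-slot hint), divided by the pointer size.  */
static uint64_t
bucket_slots (uint64_t page_size, uint64_t word_size, unsigned int b)
{
  return (bucket_bytes (page_size, b) - 2 * word_size) / word_size;
}

/* Assumed encoding: slot index in the high bits, bucket in the low 5.  */
static uint64_t
key_encode (unsigned int bucket, uint64_t slot)
{
  return (slot << BUCKET_BITS) | (bucket & BUCKET_MASK);
}

static unsigned int
key_bucket (uint64_t key)
{
  return (unsigned int) (key & BUCKET_MASK);
}

static uint64_t
key_slot (uint64_t key)
{
  return key >> BUCKET_BITS;
}
```

Under these assumptions bucket 1 holds (4096 - 16) / 8 = 510 slots, and bucket 31 holds 2^39 - 2 slots, which matches the "about half a trillion" figure.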
The root data initially points at a structure in ld.so but if it's
NULL a backup copy in libc.so is used; this happens when statically
linked.
This implementation is, I admit, a bit inelegant. I focused more on
correctness than on performance, but this part of the code is new to
me and I kept finding little spots throughout the source tree that
"knew" about the logic, especially the nptl_db library - which I ended
up just commenting out as:
1. The internals are not documented
2. There are no test cases for it
3. I could find nothing that used those functions (not even gdb)
4. An AI-generated test case didn't even work with the old logic.
So with no users and no way to know if the code I was writing was
correct, I just left it empty.
-------------------- 8< --------------------
Comments
On 08/05/26 00:53, DJ Delorie wrote:
>
> This patch makes a couple of key changes to the pthread
> thread-specific-data (tsd) key logic:
>
> * Root data is in ld.so and not libc.so
> * Malloc is not used for allocation
> * No practical key limit
>
> This primarily addresses the problems when dlmopen namespaces are used
> and multiple copies of libc.so are loaded; previously there would be
> two copies of the root data and thus keys could not be shared across
> the namespace boundary.
>
> The mechanism of the change is as follows:
>
> (based on https://conf.gnu-tools-cauldron.org/opo25/talk/LQTU3G/)
>
> The old method supported a fixed 1024 keys; the root data was a single
> 1024-length array and the thread data was [32][32] array with the
> second level allocated with malloc. The new method uses a [32] array
> for the first level like the old method, but the second level is
> allocated with mmap, and the first two words hold the number of key
> slots and a hint for finding free slots (the hint is unused in the
> per-thread data, but remains to keep the code simpler and safer). A
> key's 5 LSBs index the first level, and the remaining bits index the
> second level. The first second level (i.e. [0][]) points to a small
> fixed array in the per-thread area, like the old method, but this area
> can now hold up to 62 keys (the sequence number is not used at this
> time). The second second layer (i.e. [1][]) points to a single mmap'd
> page and can hold 510 keys. Each additional second layer is double
> the size of the previous, up to about half a trillion keys in the 31st
> second layer.
This is a silent semantic incompatibility for any application that persists a key
value across exec, shared memory, or a library reload. We already made a similar
change for pthread_cond, so it should be OK to make this one too.
>
> The root data initially points at a structure in ld.so but if it's
> NULL a backup copy in libc.so is used; this happens when statically
> linked.
>
> This implementation is, I admit, a bit inelegant. I focused more on
> correctness than on performance, but this part of the code is new to
> me and I kept finding little spots throughout the source tree that
> "knew" about the logic, especially the nptl_db library - which I ended
> up just commenting out as:
>
> 1. The internals are not documented
> 2. There are no test cases for it
> 3. I could find nothing that used those functions (not even gdb)
> 4. An AI-generated test case didn't even work with the old logic.
>
> So with no users and no way to know if the code I was writing was
> correct, I just left it empty.
The old design stored a generation sequence number in both the global slot and
the per-thread slot (KEY_UNUSED and KEY_USABLE). When a key was deleted and its
slot reused, per-thread values from the old key failed the sequence-number check
in pthread_getspecific and deallocate_tsd, and were silently ignored.
The new design replaces this with zeroing the existing per-thread slots at
key creation. It seems somewhat weaker:
* A thread calling pthread_setspecific(K, v) reads the global slot (outside any lock)
to validate K, then writes its per-thread slot. Between those two steps, another
thread can delete K and a third thread can create a new key that reuses the same slot.
The first thread's write then silently deposits a value for the new key.
* __nptl_deallocate_tsd reads the global destructor and the per-thread value with
no synchronisation between them. If the key is deleted and reused between those
two reads, the wrong destructor may be called with the wrong value.
I think it will need either to keep the generation counter or to add synchronization
to pthread_setspecific and rely on delete-time zeroing. The extra synchronization
would add overhead to what should be a fast path, so I am not sure which one is
preferable.
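To make the generation-counter idea concrete, here is a minimal single-threaded model of it. This is only a sketch: the names are hypothetical, the table is tiny, and it uses plain increments where the removed glibc code uses an atomic compare-and-exchange on the sequence number. It shows only how a stale per-thread value is rejected after the slot is reused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NKEYS 4

/* Global slot: an odd seq means "in use", an even seq means "free".  */
struct key_slot { uintptr_t seq; void (*destr) (void *); };
/* Per-thread slot: remembers the seq seen when the value was stored.  */
struct tsd_slot { uintptr_t seq; void *data; };

static struct key_slot keys[NKEYS];
static struct tsd_slot tsd[NKEYS];      /* One thread's values.  */

static int
key_create (unsigned int *key, void (*destr) (void *))
{
  for (unsigned int i = 0; i < NKEYS; i++)
    if ((keys[i].seq & 1) == 0)
      {
        keys[i].seq++;                  /* Even -> odd: slot now in use.  */
        keys[i].destr = destr;
        *key = i;
        return 0;
      }
  return -1;
}

static int
key_delete (unsigned int key)
{
  if (key >= NKEYS || (keys[key].seq & 1) == 0)
    return -1;
  keys[key].seq++;                      /* Odd -> even: slot free again.  */
  return 0;
}

static int
set_specific (unsigned int key, void *value)
{
  if (key >= NKEYS || (keys[key].seq & 1) == 0)
    return -1;
  tsd[key].seq = keys[key].seq;         /* Tag the value with the generation.  */
  tsd[key].data = value;
  return 0;
}

static void *
get_specific (unsigned int key)
{
  if (key >= NKEYS || tsd[key].data == NULL)
    return NULL;
  if (tsd[key].seq != keys[key].seq)    /* Value belongs to an older key.  */
    {
      tsd[key].data = NULL;
      return NULL;
    }
  return tsd[key].data;
}
```

In this model a value stored under a deleted key is never returned for the key that later reuses the slot, without anyone having to visit other threads' slots at creation or deletion time.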
>
> -------------------- 8< --------------------
>
> diff --git a/elf/Makefile b/elf/Makefile
> index c835eb8156..83f9de2da4 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -74,6 +74,7 @@ dl-routines = \
> dl-open \
> dl-origin \
> dl-printf \
> + dl-pthread-keys \
> dl-readonly-area \
> dl-reloc \
> dl-runtime \
> diff --git a/elf/dl-pthread-keys.c b/elf/dl-pthread-keys.c
> new file mode 100644
> index 0000000000..481fbac464
> --- /dev/null
> +++ b/elf/dl-pthread-keys.c
> @@ -0,0 +1,21 @@
> +#include <unistd.h>
> +#include <ldsodefs.h>
> +#include <list.h>
> +#include <libc-lock.h>
> +#include "pthreadP.h"
> +
> +/* This struct must match the global definition in pthreadP.h so that
> + the slot counts match. */
> +static struct {
> + size_t num_slots;
> + size_t next_free_slot_hint;
> + void (*destrs[PTHREAD_KEY_ZEROBLOCK_COUNT])(void *);
> +} key_bucket_zero =
> + {
> + PTHREAD_KEY_ZEROBLOCK_COUNT, 0,
> + { [0 ... PTHREAD_KEY_ZEROBLOCK_COUNT-1] = PTHREAD_KEY_SLOT_FREE }
> + };
> +
> +/* Table of the key information. */
> +struct pthread_key_bucket *_dl_pthread_key_buckets_ldso[32] =
> + { (struct pthread_key_bucket *)&key_bucket_zero };
> diff --git a/elf/dl-support.c b/elf/dl-support.c
> index 0508d6113b..720d763c16 100644
> --- a/elf/dl-support.c
> +++ b/elf/dl-support.c
> @@ -176,6 +176,9 @@ list_t _dl_stack_cache;
> size_t _dl_stack_cache_actsize;
> uintptr_t _dl_in_flight_stack;
> int _dl_stack_cache_lock;
> +
> +extern struct pthread_key_bucket *_dl_pthread_key_buckets_ldso[32];
> +void *_dl_pthread_keys_data = _dl_pthread_key_buckets_ldso;
> #endif
> struct dl_scope_free_list *_dl_scope_free_list;
>
> diff --git a/elf/rtld.c b/elf/rtld.c
> index e926ec73e4..fdb7476dec 100644
> --- a/elf/rtld.c
> +++ b/elf/rtld.c
> @@ -729,6 +729,8 @@ match_version (const char *string, struct link_map *map)
>
> bool __rtld_tls_init_tp_called;
>
> +extern struct pthread_key_bucket *_dl_pthread_key_buckets_ldso[32];
> +
> static void *
> init_tls (size_t naudit)
> {
> @@ -775,6 +777,8 @@ cannot allocate TLS data structures for initial thread\n");
> so it knows not to pass this dtv to the normal realloc. */
> GL(dl_initial_dtv) = GET_DTV (tcbp);
>
> + GL(dl_pthread_keys_data) = _dl_pthread_key_buckets_ldso;
> +
> /* And finally install it for the main thread. */
> call_tls_init_tp (tcbp);
> __rtld_tls_init_tp_called = true;
> diff --git a/nptl/Makefile b/nptl/Makefile
> index 02862d1c04..bfb999b898 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -123,7 +123,6 @@ routines = \
> pthread_join_common \
> pthread_key_create \
> pthread_key_delete \
> - pthread_keys \
> pthread_kill \
> pthread_kill_other_threads \
> pthread_mutex_cond_lock \
> @@ -321,6 +320,7 @@ tests = \
> tst-pthread-gdb-attach-static \
> tst-pthread-getcpuclockid-invalid \
> tst-pthread-key1-static \
> + tst-pthread-keys-ns \
> tst-pthread-timedlock-lockloop \
> tst-pthread_exit-nothreads \
> tst-pthread_exit-nothreads-static \
> @@ -485,6 +485,7 @@ modules-names = \
> tst-audit-threads-mod1 \
> tst-audit-threads-mod2 \
> tst-compat-forwarder-mod \
> + tst-pthread-keys-ns1 \
> tst-stack4mod \
> tst-tls-debug-mod \
> tst-tls3mod \
> @@ -687,6 +688,8 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
> $(evaluate-test)
> endif
>
> +$(objpfx)tst-pthread-keys-ns.out: $(objpfx)tst-pthread-keys-ns1.so
> +
> LDLIBS-tst-cancel24 = -Wl,--no-as-needed -lstdc++
> LDLIBS-tst-cancel24-static = $(LDLIBS-tst-cancel24)
>
> diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
> index b2ecb00113..559c160022 100644
> --- a/nptl/allocatestack.c
> +++ b/nptl/allocatestack.c
> @@ -36,6 +36,7 @@
> #include <intprops.h>
> #include <setvmaname.h>
>
> +
> /* Default alignment of stack. */
> #ifndef STACK_ALIGN
> # define STACK_ALIGN __alignof__ (long double)
> @@ -424,7 +425,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
> memset (pd, '\0', sizeof (struct pthread));
>
> /* The first TSD block is included in the TCB. */
> - pd->specific[0] = pd->specific_1stblock;
> + _pthread_key_init (pd);
>
> /* Remember the stack-related values. */
> pd->stackblock = (char *) stackaddr - size;
> @@ -548,7 +549,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
> /* We allocated the first block thread-specific data array.
> This address will not change for the lifetime of this
> descriptor. */
> - pd->specific[0] = pd->specific_1stblock;
> + _pthread_key_init (pd);
>
> /* This is at least the second thread. */
> pd->header.multiple_threads = 1;
> diff --git a/nptl/descr.h b/nptl/descr.h
> index 627cc3980f..687e1bb8e0 100644
> --- a/nptl/descr.h
> +++ b/nptl/descr.h
> @@ -57,9 +57,6 @@
> ((PTHREAD_KEYS_MAX + PTHREAD_KEY_2NDLEVEL_SIZE - 1) \
> / PTHREAD_KEY_2NDLEVEL_SIZE)
>
> -
> -
> -
> /* Internal version of the buffer to store cancellation handler
> information. */
> struct pthread_unwind_buf
> @@ -320,6 +317,12 @@ struct pthread
>
> /* We allocate one block of references here. This should be enough
> to avoid allocating any memory dynamically for most applications. */
> + /* Note that the new implementation doesn't use the seq, and treats
> + the whole block as a large array of pointers, with the first slot
> + being the number of pointers. We avoid changing the declaration
> + here to avoid compatibility issues. */
> + /* Renamed so I could find all the references to them and fix them,
> + but left in place to keep the struct the same size. */
> struct pthread_key_data
> {
> /* Sequence number. We use uintptr_t to not require padding on
> @@ -329,10 +332,12 @@ struct pthread
>
> /* Data pointer. */
> void *data;
> - } specific_1stblock[PTHREAD_KEY_2NDLEVEL_SIZE];
> + } specific_1stblock_dj[PTHREAD_KEY_2NDLEVEL_SIZE];
> +/* Number of keys in block zero of new structure. */
> +#define PTHREAD_KEY_ZEROBLOCK_COUNT (PTHREAD_KEY_2NDLEVEL_SIZE * 2 - 2)
>
> /* Two-level array for the thread-specific data. */
> - struct pthread_key_data *specific[PTHREAD_KEY_1STLEVEL_SIZE];
> + struct pthread_key_bucket *pthread_key_buckets[PTHREAD_KEY_1STLEVEL_SIZE];
>
> /* Flag which is set when specific data is set. */
> bool specific_used;
> diff --git a/nptl/nptl_deallocate_tsd.c b/nptl/nptl_deallocate_tsd.c
> index 4c3960b516..a8ddad0815 100644
> --- a/nptl/nptl_deallocate_tsd.c
> +++ b/nptl/nptl_deallocate_tsd.c
> @@ -15,97 +15,78 @@
> License along with the GNU C Library; if not, see
> <https://www.gnu.org/licenses/>. */
>
> +#include <ldsodefs.h>
> #include <pthreadP.h>
>
> -/* Deallocate POSIX thread-local-storage. */
> +static size_t page_size = 0;
> +#define bucket2mlen(b) (((page_size != 0) ? page_size : get_page_size()) << (b-1))
> +#define bucket2slots(b) ((bucket2mlen(b) - 2 * sizeof(size_t)) / sizeof (void *))
> +
> +static size_t
> +get_page_size(void)
> +{
> + page_size = EXEC_PAGESIZE;
> + return page_size;
> +}
> +
> +/* Called by each thread at its termination, to destruct its key
> + values. Atomics are not needed for thread-specific data, but locks
> + are needed on global data. */
> void
> __nptl_deallocate_tsd (void)
> {
> - struct pthread *self = THREAD_SELF;
> + struct pthread_key_bucket **destr_buckets;
> + struct pthread_key_bucket *bucket;
>
> /* Maybe no data was ever allocated. This happens often so we have
> a flag for this. */
> - if (THREAD_GETMEM (self, specific_used))
> + if (THREAD_GETMEM (THREAD_SELF, specific_used))
> {
> - size_t round;
> - size_t cnt;
> -
> - round = 0;
> - do
> - {
> - size_t idx;
> -
> - /* So far no new nonzero data entry. */
> - THREAD_SETMEM (self, specific_used, false);
> -
> - for (cnt = idx = 0; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
> - {
> - struct pthread_key_data *level2;
> -
> - level2 = THREAD_GETMEM_NC (self, specific, cnt);
> -
> - if (level2 != NULL)
> - {
> - size_t inner;
> + int b, i, iters_left = 32;
> +
> + destr_buckets = GL(dl_pthread_keys_data);
> +
> + /* Destroy all current key values. */
> + while (--iters_left && THREAD_SELF->specific_used)
> + {
> + /* So far no new nonzero data entry. */
> + THREAD_SETMEM (THREAD_SELF, specific_used, false);
> + for (b = 0; b < 32; b ++)
> + {
> + bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
> +
> + if (bucket != NULL)
> + {
> + int slots = bucket->num_slots;
> + for (i = 0; i <slots; i ++)
> + {
> + void (*d)(void *) = atomic_load_relaxed (&destr_buckets[b]->keys[i]);
> + if (d != PTHREAD_KEY_SLOT_FREE && d != NULL
> + && bucket->keys[i] != NULL)
Wouldn't this skip keys created with a NULL destructor (destr == NULL)? I think
get_cached_stack might reuse the cached stack for a new thread without calling
_pthread_key_init.
> + {
> + const void *v = bucket->keys[i];
> + bucket->keys[i] = NULL;
> + d ((void *) v);
> + }
> + }
> + }
> + }
> + }
> +
> + /* Unmap all mmap'd buckets. */
> + for (b = 1; b < 32; b ++)
> + {
> + bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
> + if (bucket != NULL)
> + {
> + size_t mlen = bucket2mlen (b);
> + __munmap (bucket, mlen);
> + THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, b, 0);
> + }
> + }
>
> - for (inner = 0; inner < PTHREAD_KEY_2NDLEVEL_SIZE;
> - ++inner, ++idx)
> - {
> - void *data = level2[inner].data;
> -
> - if (data != NULL)
> - {
> - /* Always clear the data. */
> - level2[inner].data = NULL;
> -
> - /* Make sure the data corresponds to a valid
> - key. This test fails if the key was
> - deallocated and also if it was
> - re-allocated. It is the user's
> - responsibility to free the memory in this
> - case. */
> - if (level2[inner].seq
> - == __pthread_keys[idx].seq
> - /* It is not necessary to register a destructor
> - function. */
> - && __pthread_keys[idx].destr != NULL)
> - /* Call the user-provided destructor. */
> - __pthread_keys[idx].destr (data);
> - }
> - }
> - }
> - else
> - idx += PTHREAD_KEY_1STLEVEL_SIZE;
> - }
> -
> - if (THREAD_GETMEM (self, specific_used) == 0)
> - /* No data has been modified. */
> - goto just_free;
> - }
> - /* We only repeat the process a fixed number of times. */
> - while (__builtin_expect (++round < PTHREAD_DESTRUCTOR_ITERATIONS, 0));
> -
> - /* Just clear the memory of the first block for reuse. */
> - memset (&THREAD_SELF->specific_1stblock, '\0',
> - sizeof (self->specific_1stblock));
> -
> - just_free:
> - /* Free the memory for the other blocks. */
> - for (cnt = 1; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
> - {
> - struct pthread_key_data *level2;
> -
> - level2 = THREAD_GETMEM_NC (self, specific, cnt);
> - if (level2 != NULL)
> - {
> - /* The first block is allocated as part of the thread
> - descriptor. */
> - free (level2);
> - THREAD_SETMEM_NC (self, specific, cnt, NULL);
> - }
> - }
> -
> - THREAD_SETMEM (self, specific_used, false);
> }
> + THREAD_SETMEM (THREAD_SELF, specific_used, false);
> }
> +
> libc_hidden_def (__nptl_deallocate_tsd)
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index 9a0cefb0f5..4bbf95893c 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -434,6 +434,7 @@ start_thread (void *arg)
> {
> /* Store the new cleanup handler info. */
> THREAD_SETMEM (pd, cleanup_jmp_buf, &unwind_buf);
> + assert (pd == THREAD_SELF);
>
> internal_signal_restore_set (&pd->sigmask);
>
> diff --git a/nptl/pthread_getspecific.c b/nptl/pthread_getspecific.c
> index 136f4493ed..d8d304ee3b 100644
> --- a/nptl/pthread_getspecific.c
> +++ b/nptl/pthread_getspecific.c
> @@ -16,51 +16,43 @@
> <https://www.gnu.org/licenses/>. */
>
> #include <stdlib.h>
> -#include "pthreadP.h"
> #include <shlib-compat.h>
>
> +#include "pthreadP.h"
> +#include "ldsodefs.h"
> +
> +extern void *___ldso_pthread_getspecific (pthread_key_t key)
> + weak_function;
> +
> void *
> ___pthread_getspecific (pthread_key_t key)
> {
> - struct pthread_key_data *data;
> -
> - /* Special case access to the first 2nd-level block. This is the
> - usual case. */
> - if (__glibc_likely (key < PTHREAD_KEY_2NDLEVEL_SIZE))
> - data = &THREAD_SELF->specific_1stblock[key];
> - else
> - {
> - /* Verify the key is sane. */
> - if (key >= PTHREAD_KEYS_MAX)
> - /* Not valid. */
> - return NULL;
> + int b = PTHREAD_KEY_DECODE_BUCKET(key);
> + int i = PTHREAD_KEY_DECODE_SLOT(key);
> + struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
>
> - unsigned int idx1st = key / PTHREAD_KEY_2NDLEVEL_SIZE;
> - unsigned int idx2nd = key % PTHREAD_KEY_2NDLEVEL_SIZE;
> + /* The spec doesn't allow for errors, so remove these checks after
> + development? */
> + if (buckets == NULL)
> + return NULL;
> + if (buckets[b] == NULL)
> + return NULL;
>
> - /* If the sequence number doesn't match or the key cannot be defined
> - for this thread since the second level array is not allocated
> - return NULL, too. */
> - struct pthread_key_data *level2 = THREAD_GETMEM_NC (THREAD_SELF,
> - specific, idx1st);
> - if (level2 == NULL)
> - /* Not allocated, therefore no data. */
> - return NULL;
> + struct pthread_key_bucket *bucket = buckets[b];
> + if (i < 0 || i >= bucket->num_slots)
> + return NULL;
>
> - /* There is data. */
> - data = &level2[idx2nd];
> - }
> + if (bucket->keys[i] == PTHREAD_KEY_SLOT_FREE)
> + return NULL;
>
> - void *result = data->data;
> - if (result != NULL)
> - {
> - uintptr_t seq = data->seq;
> + bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
>
> - if (__glibc_unlikely (seq != __pthread_keys[key].seq))
> - result = data->data = NULL;
> - }
> + /* key created before this thread, and the mapping is not yet
> + created. */
> + if (bucket == NULL)
> + return NULL;
>
> - return result;
> + return bucket->keys[i];
> }
> versioned_symbol (libc, ___pthread_getspecific, pthread_getspecific,
> GLIBC_2_34);
> diff --git a/nptl/pthread_key_create.c b/nptl/pthread_key_create.c
> index e4c963bbc6..f23b9cb114 100644
> --- a/nptl/pthread_key_create.c
> +++ b/nptl/pthread_key_create.c
> @@ -15,37 +15,152 @@
> License along with the GNU C Library; if not, see
> <https://www.gnu.org/licenses/>. */
>
> +#include <assert.h>
> #include <errno.h>
> -#include "pthreadP.h"
> +#include <list.h>
> #include <atomic.h>
> #include <shlib-compat.h>
>
> +#include "ldsodefs.h"
> +#include "pthreadP.h"
> +
> +_Static_assert (PTHREAD_KEY_1STLEVEL_SIZE >= 32, "incompatible TSD sizes");
> +
> +__libc_lock_define_initialized_recursive (, __pthread_key_lock)
> +
> +/* This struct must match the global definition in pthreadP.h so that
> + the slot counts match. */
> +static struct {
> + size_t num_slots;
> + size_t next_free_slot_hint;
> + void (*destrs[PTHREAD_KEY_ZEROBLOCK_COUNT])(void *);
> +} key_bucket_zero =
> + {
> + PTHREAD_KEY_ZEROBLOCK_COUNT, 0,
> + { [0 ... PTHREAD_KEY_ZEROBLOCK_COUNT-1] = PTHREAD_KEY_SLOT_FREE }
> + };
> +
> +/* Table of the key information. */
> +static struct pthread_key_bucket *__pthread_key_buckets_local[32] =
> + { (struct pthread_key_bucket *)&key_bucket_zero };
> +
> +static size_t page_size = 0;
> +#define bucket2mlen(b) (((page_size != 0) ? page_size : get_page_size()) << (b-1))
> +#define bucket2slots(b) ((bucket2mlen(b) - 2 * sizeof(size_t)) / sizeof (void *))
> +
> +static size_t
> +get_page_size(void)
> +{
> + page_size = EXEC_PAGESIZE;
> + return page_size;
> +}
> +
> +void
> +_pthread_key_init (struct pthread *pd)
> +{
> + /* Initialize global key data. */
> + if (GL(dl_pthread_keys_data) == NULL)
> + GL(dl_pthread_keys_data) = __pthread_key_buckets_local;
> +
> + /* Initialize thread key data. */
> + struct pthread_key_bucket *bucket
> + = (struct pthread_key_bucket *) & (pd->specific_1stblock_dj);
> + bucket->num_slots = PTHREAD_KEY_ZEROBLOCK_COUNT;
> + bucket->next_free_slot_hint = 0;
> + memset (bucket->keys, 0, PTHREAD_KEY_ZEROBLOCK_COUNT * sizeof (void *));
> + pd->pthread_key_buckets[0] = bucket;
> +}
> +libc_hidden_def (_pthread_key_init)
> +
> +
> int
> ___pthread_key_create (pthread_key_t *key, void (*destr) (void *))
> {
> - /* Find a slot in __pthread_keys which is unused. */
> - for (size_t cnt = 0; cnt < PTHREAD_KEYS_MAX; ++cnt)
> + int b, i;
> + list_t *runp;
> +
> + PTHREAD_KEY_LOCK;
> + struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
> + if (buckets == NULL)
I think this does not address the namespace issue you are trying to fix.
__pthread_key_lock is defined as attribute_hidden in libc.so, so when multiple
libc instances are loaded with dlmopen, each namespace has its own copy of the lock.
However, they all share the same key table in ld.so via GL(dl_pthread_keys_data).
Concurrent pthread_key_create / pthread_key_delete calls from different namespaces
would therefore race on the shared global bucket array. I think we will need something
analogous to GL(dl_stack_cache_lock) for GL(dl_pthread_keys_data).
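A rough sketch of the shape this could take: the lock lives next to the shared table, so every libc copy reaches the same lock through the same pointer. All names here are hypothetical stand-ins (struct shared_keys for the ld.so-owned data, struct libc_copy for one namespace), and a plain pthread mutex is used instead of glibc's internal lock primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NSLOTS 8

/* Stand-in for data owned by ld.so: exactly one copy per process,
   with the lock stored alongside the table it protects.  */
struct shared_keys
{
  pthread_mutex_t lock;
  unsigned char in_use[NSLOTS];
  void (*destr[NSLOTS]) (void *);
};

/* Stand-in for one loaded copy of libc.so; it holds no lock of its own.  */
struct libc_copy { struct shared_keys *root; };

static void
shared_keys_init (struct shared_keys *sk)
{
  pthread_mutex_init (&sk->lock, NULL);
  for (int i = 0; i < NSLOTS; i++)
    {
      sk->in_use[i] = 0;
      sk->destr[i] = NULL;
    }
}

/* Allocate a key through namespace NS.  Returns the slot index,
   or -1 if the table is full.  */
static int
key_create_in (struct libc_copy *ns, void (*destr) (void *))
{
  struct shared_keys *sk = ns->root;
  int key = -1;

  pthread_mutex_lock (&sk->lock);       /* Same lock for every namespace.  */
  for (int i = 0; i < NSLOTS; i++)
    if (!sk->in_use[i])
      {
        sk->in_use[i] = 1;
        sk->destr[i] = destr;
        key = i;
        break;
      }
  pthread_mutex_unlock (&sk->lock);
  return key;
}
```

With a per-namespace lock, two copies calling key_create_in concurrently could both pick the same free slot; with the lock stored in the shared structure they serialize on it.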
> {
> - uintptr_t seq = __pthread_keys[cnt].seq;
> + GL(dl_pthread_keys_data) = __pthread_key_buckets_local;
> + buckets = GL(dl_pthread_keys_data);
> + }
>
> - if (KEY_UNUSED (seq) && KEY_USABLE (seq)
> - /* We found an unused slot. Try to allocate it. */
> - && ! atomic_compare_and_exchange_bool_acq (&__pthread_keys[cnt].seq,
> - seq + 1, seq))
> + for (b = 0; b < 32; b ++)
> + {
> + struct pthread_key_bucket *bucket = buckets[b];
> +
> + if (bucket == NULL)
> {
> - /* Remember the destructor. */
> - __pthread_keys[cnt].destr = destr;
> + if (b == 0)
> + *((int *)0) = 0;
> + else
> + {
> + void *v;
> + size_t mlen = bucket2mlen (b);
> + v = __mmap (NULL, mlen, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
> +
> + if (v == MAP_FAILED)
> + {
> + PTHREAD_KEY_UNLOCK;
> + return ENOMEM; /* not in spec, but EAGAIN is the only other option. */
> + }
> + if (v == NULL)
> + *((int *)0) = 0;
>
> - /* Return the key to the caller. */
> - *key = cnt;
> + bucket = (struct pthread_key_bucket *) v;
> + buckets[b] = bucket;
> + int num_slots = bucket2slots (b);
> + bucket->num_slots = num_slots;
> + bucket->next_free_slot_hint = 0;
> + for (i=0; i<bucket->num_slots; i++)
> + bucket->keys[i] = PTHREAD_KEY_SLOT_FREE;
>
> - /* The call succeeded. */
> - return 0;
> + }
> }
> +
> + /* Initialize all values to NULL. In the old code, this was
> + optimized by storing a sequence with each destr and value to
> + detect "stale" values. We could optimize this, but the easy
> + option takes less memory. */
> + int f = bucket->next_free_slot_hint;
> + int s = bucket->num_slots;
> + for (i = 0; i < bucket->num_slots; i ++)
> + if (bucket->keys[(i+f)%s] == PTHREAD_KEY_SLOT_FREE)
> + {
> + int idx = (i+f)%s;
> + struct pthread_key_bucket *sb;
> +
> + bucket->keys[idx] = (void *) destr;
> + bucket->next_free_slot_hint = (idx+1)%s;
> + *key = PTHREAD_KEY_ENCODE (b, idx);
> +
You need to lock GL(dl_stack_cache_lock) before operating on GL(dl_stack_used)
and unlock it afterwards, similar to how new stacks are added in nptl/allocatestack.c.
> + list_for_each (runp, &GL (dl_stack_used))
> + {
> + struct pthread *t = list_entry (runp, struct pthread, list);
> + sb = THREAD_GETMEM_NC (t, pthread_key_buckets, b);
> + if (sb)
> + atomic_store_relaxed (& sb->keys[idx], NULL);
And I think atomic operation if the idea is to have proper mutual exclusion
between GL(dl_stack_used) and GL(dl_pthread_keys_data).
> + }
> +
> + list_for_each (runp, &GL (dl_stack_user))
> + {
> + struct pthread *t = list_entry (runp, struct pthread, list);
> + sb = THREAD_GETMEM_NC (t, pthread_key_buckets, b);
> + if (sb)
> + atomic_store_relaxed (& sb->keys[idx], NULL);
> + }
> +
> + PTHREAD_KEY_UNLOCK;
> + return 0;
> + }
> }
>
> + PTHREAD_KEY_UNLOCK;
> return EAGAIN;
> }
> +
> versioned_symbol (libc, ___pthread_key_create, __pthread_key_create,
> GLIBC_2_34);
> libc_hidden_ver (___pthread_key_create, __pthread_key_create)
> diff --git a/nptl/pthread_key_delete.c b/nptl/pthread_key_delete.c
> index 278e523ab1..0ebac121eb 100644
> --- a/nptl/pthread_key_delete.c
> +++ b/nptl/pthread_key_delete.c
> @@ -16,27 +16,47 @@
> <https://www.gnu.org/licenses/>. */
>
> #include <errno.h>
> -#include "pthreadP.h"
> #include <atomic.h>
> #include <shlib-compat.h>
>
> +#include "ldsodefs.h"
> +#include "pthreadP.h"
> +
> +extern int ___ldso_pthread_key_delete (pthread_key_t key)
> + weak_function;
> +
> int
> ___pthread_key_delete (pthread_key_t key)
> {
> - int result = EINVAL;
> + int b = PTHREAD_KEY_DECODE_BUCKET(key);
> + int i = PTHREAD_KEY_DECODE_SLOT(key);
> + struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
> +
> + PTHREAD_KEY_LOCK;
>
> - if (__glibc_likely (key < PTHREAD_KEYS_MAX))
> + if (buckets[b] == NULL)
> {
> - unsigned int seq = __pthread_keys[key].seq;
> + PTHREAD_KEY_UNLOCK;
> + return EAGAIN;
> + }
Not sure if POSIX allows returning EAGAIN here; I think it should be EINVAL.
>
> - if (__builtin_expect (! KEY_UNUSED (seq), 1)
> - && ! atomic_compare_and_exchange_bool_acq (&__pthread_keys[key].seq,
> - seq + 1, seq))
> - /* We deleted a valid key. */
> - result = 0;
> + struct pthread_key_bucket *bucket = buckets[b];
> + if (i < 0 || i >= bucket->num_slots)
> + {
> + PTHREAD_KEY_UNLOCK;
> + return EAGAIN;
> + }
> +
> + if (bucket->keys[i] == PTHREAD_KEY_SLOT_FREE)
> + {
> + PTHREAD_KEY_UNLOCK;
> + return EAGAIN;
> }
>
> - return result;
> + bucket->keys[i] = PTHREAD_KEY_SLOT_FREE;
> + bucket->next_free_slot_hint = i;
> + PTHREAD_KEY_UNLOCK;
> + return 0;
> }
> versioned_symbol (libc, ___pthread_key_delete, pthread_key_delete,
> GLIBC_2_34);
> diff --git a/nptl/pthread_keys.c b/nptl/pthread_keys.c
> deleted file mode 100644
> index a98479e144..0000000000
> --- a/nptl/pthread_keys.c
> +++ /dev/null
> @@ -1,23 +0,0 @@
> -/* Table of pthread_key_create keys and their destructors.
> - Copyright (C) 2004-2026 Free Software Foundation, Inc.
> - This file is part of the GNU C Library.
> -
> - The GNU C Library is free software; you can redistribute it and/or
> - modify it under the terms of the GNU Lesser General Public
> - License as published by the Free Software Foundation; either
> - version 2.1 of the License, or (at your option) any later version.
> -
> - The GNU C Library is distributed in the hope that it will be useful,
> - but WITHOUT ANY WARRANTY; without even the implied warranty of
> - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> - Lesser General Public License for more details.
> -
> - You should have received a copy of the GNU Lesser General Public
> - License along with the GNU C Library; if not, see
> - <https://www.gnu.org/licenses/>. */
> -
> -#include <pthreadP.h>
> -
> -/* Table of the key information. */
> -struct pthread_key_struct __pthread_keys[PTHREAD_KEYS_MAX];
> -libc_hidden_data_def (__pthread_keys)
> diff --git a/nptl/pthread_setspecific.c b/nptl/pthread_setspecific.c
> index 3cf07a020c..d7fbd6eca2 100644
> --- a/nptl/pthread_setspecific.c
> +++ b/nptl/pthread_setspecific.c
> @@ -17,75 +17,75 @@
>
> #include <errno.h>
> #include <stdlib.h>
> -#include "pthreadP.h"
> #include <shlib-compat.h>
>
> +#include "ldsodefs.h"
> +#include "pthreadP.h"
> +
> +static size_t page_size = 0;
> +#define bucket2mlen(b) (((page_size != 0) ? page_size : get_page_size()) << (b-1))
> +#define bucket2slots(b) ((bucket2mlen(b) - 2 * sizeof(size_t)) / sizeof (void *))
> +
> +static size_t
> +get_page_size(void)
> +{
> + page_size = EXEC_PAGESIZE;
> + return page_size;
> +}
> +
> +
> int
> ___pthread_setspecific (pthread_key_t key, const void *value)
> {
> - struct pthread *self;
> - unsigned int idx1st;
> - unsigned int idx2nd;
> - struct pthread_key_data *level2;
> - unsigned int seq;
> + int b = PTHREAD_KEY_DECODE_BUCKET(key);
> + int i = PTHREAD_KEY_DECODE_SLOT(key);
>
> - self = THREAD_SELF;
> + /* We must check if the key is valid. */
>
> - /* Special case access to the first 2nd-level block. This is the
> - usual case. */
> - if (__glibc_likely (key < PTHREAD_KEY_2NDLEVEL_SIZE))
> - {
> - /* Verify the key is sane. */
> - if (KEY_UNUSED ((seq = __pthread_keys[key].seq)))
> - /* Not valid. */
> - return EINVAL;
> + /* The bucket number can only be 0..31, and all are valid. */
> + struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
> + struct pthread_key_bucket *bucket = buckets[b];
> + if (bucket == NULL)
> + return EINVAL;
> + if (i < 0 || i >= bucket->num_slots)
> + return EINVAL;
> + if (bucket->keys[i] == PTHREAD_KEY_SLOT_FREE)
> + return EINVAL;
>
> - level2 = &self->specific_1stblock[key];
> + /* If we need to mmap a bucket, do so now. */
>
> - /* Remember that we stored at least one set of data. */
> - if (value != NULL)
> - THREAD_SETMEM (self, specific_used, true);
> - }
> - else
> - {
> - if (key >= PTHREAD_KEYS_MAX
> - || KEY_UNUSED ((seq = __pthread_keys[key].seq)))
> - /* Not valid. */
> - return EINVAL;
> + buckets = THREAD_SELF -> pthread_key_buckets;
> + bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
>
> - idx1st = key / PTHREAD_KEY_2NDLEVEL_SIZE;
> - idx2nd = key % PTHREAD_KEY_2NDLEVEL_SIZE;
> + if (bucket == NULL)
> + {
> + size_t mlen = bucket2mlen (b);
> + size_t slots = bucket2slots (b);
> + struct pthread_key_bucket *m;
>
> - /* This is the second level array. Allocate it if necessary. */
> - level2 = THREAD_GETMEM_NC (self, specific, idx1st);
> - if (level2 == NULL)
> + if (b == 0)
> {
> - if (value == NULL)
> - /* We don't have to do anything. The value would in any case
> - be NULL. We can save the memory allocation. */
> - return 0;
> -
> - level2
> - = (struct pthread_key_data *) calloc (PTHREAD_KEY_2NDLEVEL_SIZE,
> - sizeof (*level2));
> - if (level2 == NULL)
> + m = (struct pthread_key_bucket *) & THREAD_SELF->specific_1stblock_dj;
> + slots = PTHREAD_KEY_ZEROBLOCK_COUNT;
> + }
> + else
> + {
> + m = mmap (0, mlen, PROT_READ|PROT_WRITE,
> + MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
> + if (m == MAP_FAILED)
> return ENOMEM;
> -
> - THREAD_SETMEM_NC (self, specific, idx1st, level2);
> }
>
> - /* Pointer to the right array element. */
> - level2 = &level2[idx2nd];
> + m->num_slots = slots;
> + m->next_free_slot_hint = 0;
> + memset (m->keys, 0, slots * sizeof (void *));
>
> - /* Remember that we stored at least one set of data. */
> - THREAD_SETMEM (self, specific_used, true);
> + bucket = buckets[b] = m;
> + THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, b, m);
> }
>
> - /* Store the data and the sequence number so that we can recognize
> - stale data. */
> - level2->seq = seq;
> - level2->data = (void *) value;
> -
> + THREAD_SETMEM (THREAD_SELF, specific_used, 1);
> + bucket->keys[i] = (void *) value;
> return 0;
> }
> versioned_symbol (libc, ___pthread_setspecific, pthread_setspecific,
> diff --git a/nptl/tst-pthread-keys-ns.c b/nptl/tst-pthread-keys-ns.c
> new file mode 100644
> index 0000000000..336e161311
> --- /dev/null
> +++ b/nptl/tst-pthread-keys-ns.c
> @@ -0,0 +1,70 @@
> +/* Verify that keys are in sync across namespaces.
> + Copyright (C) 2026 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +#include <support/check.h>
> +#include <support/xdlfcn.h>
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <pthread.h>
> +
> +/* This file loads ns1.so in a separate DSO namespace, and uses it to
> + create keys in that namespace, and compares to keys created in this
> + namespace. */
> +
> +#define NUM_KEYS 30
> +static pthread_key_t our_keys[NUM_KEYS];
> +static pthread_key_t ns_keys[NUM_KEYS];
> +
> +static int (*ns_pthread_key_create)(pthread_key_t *key,
> + void (*__destr_function) (void *));
> +
> +static int
> +do_test (void)
> +{
> + int i, j, errors=0;
> + void *so;
> +
> + so = xdlmopen (LM_ID_NEWLM, "tst-pthread-keys-ns1.so", RTLD_NOW);
> + ns_pthread_key_create = dlsym (so, "ns_pthread_key_create");
> +
> + for (i=0; i<NUM_KEYS; i++)
> + {
> + TEST_VERIFY (pthread_key_create (&our_keys[i], NULL) == 0);
> + TEST_VERIFY (ns_pthread_key_create (&ns_keys[i], NULL) == 0);
> +
> + printf(" %08x %08x\n", our_keys[i], ns_keys[i]);
> + }
> +
> + for (i=0; i<NUM_KEYS; i++)
> + for (j=0; j<NUM_KEYS; j++)
> + {
> + if (our_keys[i] == ns_keys[j])
> + {
> + if (errors < 5)
> + printf("collision %x[%d] %x[%d]\n", our_keys[i], i, ns_keys[j], j);
> + errors ++;
> + }
> + }
> +
> + xdlclose (so);
> +
> + return errors;
> +}
> +
> +#include <support/test-driver.c>
> diff --git a/nptl/tst-pthread-keys-ns1.c b/nptl/tst-pthread-keys-ns1.c
> new file mode 100644
> index 0000000000..6041920b34
> --- /dev/null
> +++ b/nptl/tst-pthread-keys-ns1.c
> @@ -0,0 +1,29 @@
> +/* Verify that keys are in sync across namespaces.
> + Copyright (C) 2026 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +#include <pthread.h>
> +#include <support/check.h>
> +
> +/* This file just connects the test case to a glibc in a different DSO
> + namespace. */
> +
> +int ns_pthread_key_create (pthread_key_t *key,
> + void (*__destr_function) (void *))
> +{
> + return pthread_key_create (key, __destr_function);
> +}
> diff --git a/nptl_db/structs.def b/nptl_db/structs.def
> index 78cf2a49a4..1987f0b549 100644
> --- a/nptl_db/structs.def
> +++ b/nptl_db/structs.def
> @@ -56,7 +56,6 @@ DB_STRUCT_FIELD (pthread, start_routine)
> DB_STRUCT_FIELD (pthread, cancelhandling)
> DB_STRUCT_FIELD (pthread, schedpolicy)
> DB_STRUCT_FIELD (pthread, schedparam_sched_priority)
> -DB_STRUCT_FIELD (pthread, specific)
> DB_STRUCT_FIELD (pthread, eventbuf)
> DB_STRUCT_FIELD (pthread, eventbuf_eventmask)
> DB_STRUCT_ARRAY_FIELD (pthread, eventbuf_eventmask_event_bits)
> @@ -81,17 +80,10 @@ DB_VARIABLE (__nptl_nthreads)
> DB_VARIABLE (__nptl_last_event)
> DB_MAIN_VARIABLE (__nptl_initial_report_events)
>
> -DB_ARRAY_VARIABLE (__pthread_keys)
> DB_STRUCT (pthread_key_struct)
> DB_STRUCT_FIELD (pthread_key_struct, seq)
> DB_STRUCT_FIELD (pthread_key_struct, destr)
>
> -DB_STRUCT (pthread_key_data)
> -DB_STRUCT_FIELD (pthread_key_data, seq)
> -DB_STRUCT_FIELD (pthread_key_data, data)
> -DB_STRUCT (pthread_key_data_level2)
> -DB_STRUCT_ARRAY_FIELD (pthread_key_data_level2, data)
> -
> DB_STRUCT_FIELD (link_map, l_tls_modid)
> DB_STRUCT_FIELD (link_map, l_tls_offset)
>
> diff --git a/nptl_db/td_ta_tsd_iter.c b/nptl_db/td_ta_tsd_iter.c
> index 19f8c553fb..34c458fbee 100644
> --- a/nptl_db/td_ta_tsd_iter.c
> +++ b/nptl_db/td_ta_tsd_iter.c
> @@ -24,6 +24,9 @@ td_err_e
> td_ta_tsd_iter (const td_thragent_t *ta_arg, td_key_iter_f *callback,
> void *cbdata_p)
> {
> +#if 1
> + return TD_ERR;
> +#else
> td_thragent_t *const ta = (td_thragent_t *) ta_arg;
> td_err_e err;
> void *keys;
> @@ -77,4 +80,5 @@ td_ta_tsd_iter (const td_thragent_t *ta_arg, td_key_iter_f *callback,
> }
>
> return TD_OK;
> +#endif
> }
> diff --git a/nptl_db/td_thr_get_info.c b/nptl_db/td_thr_get_info.c
> index bddd6f46de..7f512bae2e 100644
> --- a/nptl_db/td_thr_get_info.c
> +++ b/nptl_db/td_thr_get_info.c
> @@ -59,11 +59,6 @@ td_thr_get_info (const td_thrhandle_t *th, td_thrinfo_t *infop)
> if (err != TD_OK)
> return err;
>
> - err = DB_GET_FIELD_ADDRESS (tls, th->th_ta_p, th->th_unique,
> - pthread, specific, 0);
> - if (err != TD_OK)
> - return err;
> -
> err = DB_GET_FIELD_LOCAL (schedpolicy, th->th_ta_p, copy, pthread,
> schedpolicy, 0);
> if (err != TD_OK)
> diff --git a/nptl_db/td_thr_tsd.c b/nptl_db/td_thr_tsd.c
> index 4c207e2a7d..8f94f30853 100644
> --- a/nptl_db/td_thr_tsd.c
> +++ b/nptl_db/td_thr_tsd.c
> @@ -18,11 +18,17 @@
>
> #include <stdint.h>
> #include "thread_dbP.h"
> +#include <stdio.h>
>
>
> td_err_e
> td_thr_tsd (const td_thrhandle_t *th, const thread_key_t tk, void **data)
> {
> +#if 1
> + /* As far as I can tell, this interface isn't used by anything, the
> + internals are not documented, and there's no testsuite. - DJ */
> + return TD_ERR;
> +#else
> td_err_e err;
> psaddr_t tk_seq, level1, level2, seq, value;
> void *copy;
> @@ -92,4 +98,5 @@ td_thr_tsd (const td_thrhandle_t *th, const thread_key_t tk, void **data)
> *data = value;
>
> return err;
> +#endif
> }
> diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
> index 15c4659853..2b69965921 100644
> --- a/sysdeps/generic/ldsodefs.h
> +++ b/sysdeps/generic/ldsodefs.h
> @@ -467,6 +467,10 @@ struct rtld_global
>
> /* Mutex protecting the stack lists. */
> EXTERN int _dl_stack_cache_lock;
> +
> + /* Pthread keys global destructor tables. Actually a pointer to
> + pthread_key_bucket[32]. */
> + EXTERN void *_dl_pthread_keys_data;
> #endif
> #if __PTHREAD_HTL
> /* The total number of thread IDs currently in use, or on the list of
> @@ -1455,6 +1459,7 @@ __rtld_mutex_init (void)
> /* The initialization happens later (!__PHREAD_NPTL) or is not
> needed at all (!SHARED). */
> }
> +
> #endif /* !__PHREAD_NPTL */
>
> /* Implementation of GL (dl_libc_freeres). */
> diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
> index 72cc4087c9..e1c4317b5d 100644
> --- a/sysdeps/nptl/dl-tls_init_tp.c
> +++ b/sysdeps/nptl/dl-tls_init_tp.c
> @@ -74,7 +74,7 @@ __tls_init_tp (void)
>
> /* Early initialization of the TCB. */
> pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->joinstate);
> - THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]);
> + _pthread_key_init (pd);
> THREAD_SETMEM (pd, stack_mode, ALLOCATE_GUARD_USER);
> THREAD_SETMEM (pd, joinstate, THREAD_STATE_JOINABLE);
>
> diff --git a/sysdeps/nptl/fork.h b/sysdeps/nptl/fork.h
> index c09e57c5ab..70e3bc05ca 100644
> --- a/sysdeps/nptl/fork.h
> +++ b/sysdeps/nptl/fork.h
> @@ -113,22 +113,22 @@ reclaim_stacks (void)
>
> if (curp->specific_used)
> {
> - /* Clear the thread-specific data. */
> - memset (curp->specific_1stblock, '\0',
> - sizeof (curp->specific_1stblock));
> -
> curp->specific_used = false;
>
> - for (size_t cnt = 1; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
> - if (curp->specific[cnt] != NULL)
> - {
> - memset (curp->specific[cnt], '\0',
> - sizeof (curp->specific_1stblock));
> -
> - /* We have allocated the block which we do not
> - free here so re-set the bit. */
> - curp->specific_used = true;
> - }
> + for (size_t cnt = 0; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
> + {
> + pthread_key_bucket *b = curp->pthread_key_buckets[cnt];
> + if (b != NULL)
> + {
> + memset (b->keys, '\0',
> + b->num_slots * sizeof (b->keys[0]));
> +
> + /* We have allocated the block which we do not
> + free here so re-set the bit. */
> + if (cnt > 0)
> + curp->specific_used = true;
> + }
> + }
> }
>
> call_function_static_weak (__getrandom_reset_state, curp);
> diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
> index de432d4032..55d85a1f20 100644
> --- a/sysdeps/nptl/pthreadP.h
> +++ b/sysdeps/nptl/pthreadP.h
> @@ -184,9 +184,29 @@ extern int __attr_list_lock attribute_hidden;
> /* Concurrency handling. */
> extern int __concurrency_level attribute_hidden;
>
> -/* Thread-local data key handling. */
> -extern struct pthread_key_struct __pthread_keys[PTHREAD_KEYS_MAX];
> -libc_hidden_proto (__pthread_keys)
> +/* Thread-local data key handling. Must match struct in
> + dl-pthread-keys.c. */
> +typedef struct pthread_key_bucket {
> + size_t num_slots;
> + size_t next_free_slot_hint; /* unused */
> + void *keys[1]; /* Not actually 1 but nptl_db needs something concrete. */
> +} pthread_key_bucket;
> +
> +#define PTHREAD_KEY_ENCODE(b,i) ((pthread_key_t) (b | (i<<5)))
> +#define PTHREAD_KEY_DECODE_BUCKET(k) (((size_t)k) & 31)
> +#define PTHREAD_KEY_DECODE_SLOT(k) (((size_t)k) >> 5)
> +#define PTHREAD_KEY_SLOT_FREE (void (*)(void *)) (-1UL)
> +#if IS_IN(rtld)
> +#define PTHREAD_KEY_LOCK
> +#define PTHREAD_KEY_UNLOCK
> +#else
> +__libc_lock_define_recursive (extern, __pthread_key_lock attribute_hidden);
> +#define PTHREAD_KEY_LOCK __libc_lock_lock_recursive (__pthread_key_lock)
> +#define PTHREAD_KEY_UNLOCK __libc_lock_unlock_recursive (__pthread_key_lock)
> +#endif
> +
> +extern void _pthread_key_init (struct pthread *);
> +libc_hidden_proto (_pthread_key_init);
>
> /* Number of threads running. */
> extern unsigned int __nptl_nthreads;
> diff --git a/sysdeps/x86_64/nptl/tst-x86-64-tls-1.c b/sysdeps/x86_64/nptl/tst-x86-64-tls-1.c
> index 2a6eb2ebd6..4b707be0ad 100644
> --- a/sysdeps/x86_64/nptl/tst-x86-64-tls-1.c
> +++ b/sysdeps/x86_64/nptl/tst-x86-64-tls-1.c
> @@ -38,25 +38,25 @@ do_test (void)
>
> THREAD_SETMEM (THREAD_SELF, header.ssp_base, saved_ssp_base);
> #ifndef __ILP32__
> - struct pthread_key_data *saved_specific, *specific;
> - saved_specific = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
> + struct pthread_key_bucket *saved_pthread_key_buckets, *pthread_key_buckets;
> + saved_pthread_key_buckets = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, 1);
>
> uintptr_t value = (1UL << 57) - 1;
> - THREAD_SETMEM_NC (THREAD_SELF, specific, 1,
> - (struct pthread_key_data *) value);
> - specific = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
> - if (specific != (struct pthread_key_data *) value)
> + THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, 1,
> + (struct pthread_key_bucket *) value);
> + pthread_key_buckets = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, 1);
> + if (pthread_key_buckets != (struct pthread_key_bucket *) value)
> FAIL_EXIT1 ("THREAD_GETMEM_NC: %p != %p",
> - specific, (struct pthread_key_data *) value);
> + pthread_key_buckets, (struct pthread_key_bucket *) value);
>
> - THREAD_SETMEM_NC (THREAD_SELF, specific, 1,
> - (struct pthread_key_data *) -1UL);
> - specific = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
> - if (specific != (struct pthread_key_data *) -1UL)
> + THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, 1,
> + (struct pthread_key_bucket *) -1UL);
> + pthread_key_buckets = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, 1);
> + if (pthread_key_buckets != (struct pthread_key_bucket *) -1UL)
> FAIL_EXIT1 ("THREAD_GETMEM_NC: %p != %p",
> - specific, (struct pthread_key_data *) -1UL);
> + pthread_key_buckets, (struct pthread_key_bucket *) -1UL);
>
> - THREAD_SETMEM_NC (THREAD_SELF, specific, 1, saved_specific);
> + THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, 1, saved_pthread_key_buckets);
> #endif
> return 0;
> }
>
Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> writes:
> The old design stored a generation sequence number in both the global slot and
> the per-thread slot (KEY_UNUSED and KEY_USABLE). When a key was deleted and its
> slot reused, per-thread values from the old key failed the sequence-number check
> in pthread_getspecific and deallocate_tsd, and were silently ignored.
Were the ramifications of this memory leak fully understood?
>> + for (i = 0; i <slots; i ++)
>> + {
>> + void (*d)(void *) = atomic_load_relaxed (&destr_buckets[b]->keys[i]);
>> + if (d != PTHREAD_KEY_SLOT_FREE && d != NULL
>> + && bucket->keys[i] != NULL)
>
> Wouldn't this skip keys created with dest == NULL? I think
> get_cached_stack might reuse it for a new thread without calling
> _pthread_key_init.
We can't call a NULL destructor. The only side effect would be that a
key with a NULL destructor doesn't get cleared, but that also doesn't
set specific_used, so the loop would ignore it a fixed number of times
before the non-NULL-destructor keys are all destructed. Then we unmap
the memory, so all trace of the non-cleared keys vanishes.
Unless the destructor for one key calls setspecific for a different
key... but the destructor is only called on thread exit, so why would
it?
>> int
>> ___pthread_key_create (pthread_key_t *key, void (*destr) (void *))
>> {
>> - /* Find a slot in __pthread_keys which is unused. */
>> - for (size_t cnt = 0; cnt < PTHREAD_KEYS_MAX; ++cnt)
>> + int b, i;
>> + list_t *runp;
>> +
>> + PTHREAD_KEY_LOCK;
>> + struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
>> + if (buckets == NULL)
>
> I think this does not address the namespace issue you are trying to
> fix. The __pthread_key_lock is defined as attribute_hidden in libc.so
> and when multiple libc instances are loaded with dlmopen, each
> namespace has its own copy of the lock. However, they all share the
> same key table in ldso via GL(dl_pthread_keys_data).
Right, move lock to ld.so too...
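
To make the lock problem concrete, here is a minimal sketch under stated assumptions. Each loaded libc copy is modeled as a struct holding a pointer to "its" key lock and a pointer to the one shared table; struct libc_copy, second_copy_trylock, model_private_locks and model_shared_lock are hypothetical names, not glibc code. Plain non-recursive mutexes are used (the patch uses a recursive lock) so that a single thread can observe the outcome with trylock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical model of one loaded libc copy: each dlmopen namespace
   has its own hidden lock object, but all copies point at the same
   key table (standing in for GL(dl_pthread_keys_data)).  */
struct libc_copy
{
  pthread_mutex_t *key_lock;
  int *shared_table;
};

/* Lock FIRST's key lock, then report what a trylock through SECOND
   sees: 0 means SECOND also got in (no mutual exclusion), EBUSY means
   the two copies really exclude each other.  */
static int
second_copy_trylock (struct libc_copy *first, struct libc_copy *second)
{
  int r = pthread_mutex_lock (first->key_lock);
  assert (r == 0);
  int result = pthread_mutex_trylock (second->key_lock);
  if (result == 0)
    pthread_mutex_unlock (second->key_lock);
  pthread_mutex_unlock (first->key_lock);
  return result;
}

/* One lock per namespace, which is how v1 of the patch behaves when
   libc.so is loaded twice (modeled).  */
int
model_private_locks (void)
{
  static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
  static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
  static int table[32];
  struct libc_copy a = { &lock_a, table };
  struct libc_copy b = { &lock_b, table };
  return second_copy_trylock (&a, &b);
}

/* A single lock living next to the table, as if both were in ld.so.  */
int
model_shared_lock (void)
{
  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static int table[32];
  struct libc_copy a = { &lock, table };
  struct libc_copy b = { &lock, table };
  return second_copy_trylock (&a, &b);
}
```

In this model, model_private_locks () lets the second copy in while the first still holds its lock, whereas model_shared_lock () reports EBUSY.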
>> - if (__glibc_likely (key < PTHREAD_KEYS_MAX))
>> + if (buckets[b] == NULL)
>> {
>> - unsigned int seq = __pthread_keys[key].seq;
>> + PTHREAD_KEY_UNLOCK;
>> + return EAGAIN;
>> + }
>
> Not sure if POSIX allows returning EAGAIN here; I think it should be EINVAL.
1003.1-2024 allows but doesn't list explicit errors, other than to
forbid EINTR. I could switch them all to EINVAL though, but do we want
some way to tell the difference between a key which *could* be valid (in
bounds but not created), vs a key which *couldn't* be valid (out of
bounds) ?
https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/pthread_key_delete.html
On 12/05/26 22:00, DJ Delorie wrote:
>
> Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> writes:
>> The old design stored a generation sequence number in both the global slot and
>> the per-thread slot (KEY_UNUSED and KEY_USABLE). When a key was deleted and its
>> slot reused, per-thread values from the old key failed the sequence-number check
>> in pthread_getspecific and deallocate_tsd, and were silently ignored.
>
> Were the ramifications of this memory leak fully understood?
But this new scheme adds a different race scenario:
1. Thread 1 issues setspecific(K, v).
2. Thread 2 issues pthread_key_delete(K) then pthread_key_create(&K') reusing
the same slot/bucket, registering destructor destr'.
3. T1 exits, deallocate_tsd sees destr' at the slot and T1's per-thread keys[i] == v,
calling destr'(v) — wrong destructor + wrong data.
The mitigating walk in *_create clears per-thread slots in threads on
dl_stack_used/dl_stack_user, but only at the *create* moment; a thread that does
pthread_setspecific between the delete and the create (legal? technically undefined
per POSIX) is fine, but any in-flight value escaping the walk is mis-destroyed.
So the current code leaks (POSIX-permitted, deterministic), while this proposal
introduces undefined behavior on stale data (it calls the wrong destructor with a
wrong-type pointer). I am not sure this is an improvement.
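
The three steps above can be replayed in a small single-threaded model. This is only a sketch under stated assumptions: struct model_key, struct model_tsd, model_thread_exit and replay_race are hypothetical names, the generation counter is a simplification of the old seq logic, and the mitigating walk in *_create is deliberately left out, so the model only shows what happens to a value that escapes that walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model, not glibc code: one global key slot and one
   thread's copy of that slot.  All names are hypothetical.  */
struct model_key { uintptr_t seq; void (*destr) (void *); };
struct model_tsd { uintptr_t seq; void *data; };

static int wrong_destructor_calls;

/* Stands in for destr', the destructor of the key that reused the
   slot.  It never owned T1's value.  */
static void
destr_new (void *p)
{
  (void) p;
  ++wrong_destructor_calls;
}

/* Thread exit: run the destructor for the slot.  With CHECK_SEQ the
   per-thread generation must match the global one, as in the old
   design; without it, only the pointers are looked at.  */
static void
model_thread_exit (const struct model_key *key, struct model_tsd *tsd,
                   int check_seq)
{
  void *v = tsd->data;
  tsd->data = NULL;
  if (v == NULL || key->destr == NULL)
    return;
  if (check_seq && tsd->seq != key->seq)
    return;                     /* Stale value: leaked, not destroyed.  */
  key->destr (v);
}

/* Replays steps 1-3 and returns how often destr' ran on T1's value.  */
int
replay_race (int check_seq)
{
  static int value;
  struct model_key key = { 1, NULL };         /* K is live.  */
  struct model_tsd t1 = { key.seq, &value };  /* 1. setspecific (K, v).  */

  key.seq++;                                  /* 2. key_delete (K) ...  */
  key.seq++;                                  /*    ... key_create (&K').  */
  key.destr = destr_new;

  wrong_destructor_calls = 0;
  model_thread_exit (&key, &t1, check_seq);   /* 3. T1 exits.  */
  return wrong_destructor_calls;
}
```

With the generation check enabled the stale value is skipped (the leak); without it, the model calls destr' on a value that belonged to the deleted key.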
Maybe a better alternative would be to split the patch in two: one that fixes the
*namespace* sharing issue (move __pthread_keys[PTHREAD_KEYS_MAX] from libc into
GL(dl_pthread_keys), nptl_db keeps working with just a symbol-relocation fix),
and then *another* to fix the scalability issue.
I am not sure if we can get away from the sequence number, not without adding
potential other race issues. Maybe we can use a different sentinel value, which
pthread_key_delete sets to state that this bucket is being freed but deallocate_tsd
has not been called yet (so it will be moved to the free slots only when the thread
finishes). I need to think more about it.
Another potential issue which I just realized is that destructors can call setspecific,
so it would be good to check the new scheme with this scenario.
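
A check for that scenario could look roughly like the following. It is a minimal sketch using only the public pthread API, not a glibc testsuite file; key_a, key_b, the counters and run_destructor_chain are hypothetical names. The destructor for key_a stores a value under key_b during thread exit, so key_b's destructor only runs if the implementation notices values set from inside a destructor.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t key_a, key_b;
static int calls_a, calls_b;
static int value_a, value_b;

/* Destructor for key_a: stores a fresh value under key_b while the
   thread is already tearing down its thread-specific data.  */
static void
destr_a (void *p)
{
  (void) p;
  ++calls_a;
  pthread_setspecific (key_b, &value_b);
}

/* Destructor for key_b: counts only the value stored by destr_a.  */
static void
destr_b (void *p)
{
  if (p == &value_b)
    ++calls_b;
}

static void *
thread_func (void *arg)
{
  (void) arg;
  /* Only key_a has a value when the thread exits.  */
  pthread_setspecific (key_a, &value_a);
  return NULL;
}

/* Returns 0 on success; calls_a and calls_b then say how often each
   destructor ran for the exited thread.  */
int
run_destructor_chain (void)
{
  pthread_t thr;
  calls_a = calls_b = 0;
  if (pthread_key_create (&key_a, destr_a) != 0
      || pthread_key_create (&key_b, destr_b) != 0)
    return -1;
  if (pthread_create (&thr, NULL, thread_func, NULL) != 0)
    return -1;
  if (pthread_join (thr, NULL) != 0)
    return -1;
  pthread_key_delete (key_a);
  pthread_key_delete (key_b);
  return 0;
}
```

On an implementation that repeats the destructor pass as POSIX describes, both counters should end up at 1 after run_destructor_chain () returns.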
>
>>> + for (i = 0; i <slots; i ++)
>>> + {
>>> + void (*d)(void *) = atomic_load_relaxed (&destr_buckets[b]->keys[i]);
>>> + if (d != PTHREAD_KEY_SLOT_FREE && d != NULL
>>> + && bucket->keys[i] != NULL)
>>
>> Wouldn't this skip keys created with dest == NULL? I think
>> get_cached_stack might reuse it for a new thread without calling
>> _pthread_key_init.
>
> We can't call a NULL destructor. The only side effect would be that a
> key with a NULL destructor doesn't get cleared, but that also doesn't
> set specific_used, so the loop would ignore it a fixed number of times
> before the non-NULL-destructor keys are all destructed. Then we unmap
> the memory, so all trace of the non-cleared keys vanishes.
>
> Unless the destructor for one key calls setspecific for a different
> key... but the destructor is only called on thread exit, so why would
> it?
Does every reuse go through allocate_stack, which now unconditionally calls
_pthread_key_init, even for a stack from the cache (get_cached_stack)?
>
>>> int
>>> ___pthread_key_create (pthread_key_t *key, void (*destr) (void *))
>>> {
>>> - /* Find a slot in __pthread_keys which is unused. */
>>> - for (size_t cnt = 0; cnt < PTHREAD_KEYS_MAX; ++cnt)
>>> + int b, i;
>>> + list_t *runp;
>>> +
>>> + PTHREAD_KEY_LOCK;
>>> + struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
>>> + if (buckets == NULL)
>>
>> I think this does not address the namespace issue you are trying to
>> fix. The __pthread_key_lock is defined as attribute_hidden in libc.so
>> and when multiple libc instances are loaded with dlmopen, each
>> namespace has its own copy of the lock. However, they all share the
>> same key table in ldso via GL(dl_pthread_keys_data).
>
> Right, move lock to ld.so too...
>
>>> - if (__glibc_likely (key < PTHREAD_KEYS_MAX))
>>> + if (buckets[b] == NULL)
>>> {
>>> - unsigned int seq = __pthread_keys[key].seq;
>>> + PTHREAD_KEY_UNLOCK;
>>> + return EAGAIN;
>>> + }
>>
>> Not sure if POSIX allows returning EAGAIN here; I think it should be EINVAL.
>
> 1003.1-2024 allows but doesn't list explicit errors, other than to
> forbid EINTR. I could switch them all to EINVAL though, but do we want
> some way to tell the difference between a key which *could* be valid (in
> bounds but not created), vs a key which *couldn't* be valid (out of
> bounds) ?
But EAGAIN means "transient, retry may succeed"; this is not usually the
expected semantics for pthread_key_delete (retry until it works).
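
One way to keep the "could be valid" versus "couldn't be valid" distinction internally while still returning EINVAL for both is sketched below. This assumes the key layout from the patch (bucket in the 5 low bits, slot in the rest); struct model_bucket, classify_key, model_key_delete and demo_delete are hypothetical stand-ins, and the bucket size of 4 is chosen only to keep the example small.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as PTHREAD_KEY_DECODE_BUCKET/_SLOT in the patch;
   everything else below is a hypothetical stand-in.  */
#define MODEL_DECODE_BUCKET(k) (((size_t) (k)) & 31)
#define MODEL_DECODE_SLOT(k)   (((size_t) (k)) >> 5)
#define MODEL_ENCODE(b, i)     ((unsigned int) ((b) | ((i) << 5)))
#define MODEL_SLOT_FREE        ((void (*) (void *)) (uintptr_t) -1)

struct model_bucket
{
  size_t num_slots;
  void (*destr[4]) (void *);
};

enum key_state { KEY_LIVE, KEY_NOT_CREATED, KEY_OUT_OF_BOUNDS };

/* Internal classification: the caller can still tell the two failure
   kinds apart, for example for debugging or assertions.  */
static enum key_state
classify_key (struct model_bucket **buckets, unsigned int key)
{
  const struct model_bucket *bucket = buckets[MODEL_DECODE_BUCKET (key)];
  size_t slot = MODEL_DECODE_SLOT (key);
  if (bucket == NULL || slot >= bucket->num_slots)
    return KEY_OUT_OF_BOUNDS;
  if (bucket->destr[slot] == MODEL_SLOT_FREE)
    return KEY_NOT_CREATED;
  return KEY_LIVE;
}

/* Both failure kinds map to EINVAL, never to EAGAIN.  */
static int
model_key_delete (struct model_bucket **buckets, unsigned int key)
{
  if (classify_key (buckets, key) != KEY_LIVE)
    return EINVAL;
  buckets[MODEL_DECODE_BUCKET (key)]->destr[MODEL_DECODE_SLOT (key)]
    = MODEL_SLOT_FREE;
  return 0;
}

/* Small fixture: bucket 0 has four slots and only slot 1 is created
   (with a NULL destructor); buckets 1..31 are not allocated.  */
int
demo_delete (unsigned int key)
{
  struct model_bucket zero
    = { 4, { MODEL_SLOT_FREE, NULL, MODEL_SLOT_FREE, MODEL_SLOT_FREE } };
  struct model_bucket *buckets[32] = { &zero };
  return model_key_delete (buckets, key);
}
```

For example, demo_delete (MODEL_ENCODE (0, 1)) deletes the one live key, while a free slot, an out-of-range slot, or an unallocated bucket all yield EINVAL.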
>
> https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/pthread_key_delete.html
>
@@ -74,6 +74,7 @@ dl-routines = \
dl-open \
dl-origin \
dl-printf \
+ dl-pthread-keys \
dl-readonly-area \
dl-reloc \
dl-runtime \
new file mode 100644
@@ -0,0 +1,21 @@
+#include <unistd.h>
+#include <ldsodefs.h>
+#include <list.h>
+#include <libc-lock.h>
+#include "pthreadP.h"
+
+/* This struct must match the global definition in pthreadP.h so that
+ the slot counts match. */
+static struct {
+ size_t num_slots;
+ size_t next_free_slot_hint;
+ void (*destrs[PTHREAD_KEY_ZEROBLOCK_COUNT])(void *);
+} key_bucket_zero =
+ {
+ PTHREAD_KEY_ZEROBLOCK_COUNT, 0,
+ { [0 ... PTHREAD_KEY_ZEROBLOCK_COUNT-1] = PTHREAD_KEY_SLOT_FREE }
+ };
+
+/* Table of the key information. */
+struct pthread_key_bucket *_dl_pthread_key_buckets_ldso[32] =
+ { (struct pthread_key_bucket *)&key_bucket_zero };
@@ -176,6 +176,9 @@ list_t _dl_stack_cache;
size_t _dl_stack_cache_actsize;
uintptr_t _dl_in_flight_stack;
int _dl_stack_cache_lock;
+
+extern struct pthread_key_bucket *_dl_pthread_key_buckets_ldso[32];
+void *_dl_pthread_keys_data = _dl_pthread_key_buckets_ldso;
#endif
struct dl_scope_free_list *_dl_scope_free_list;
@@ -729,6 +729,8 @@ match_version (const char *string, struct link_map *map)
bool __rtld_tls_init_tp_called;
+extern struct pthread_key_bucket *_dl_pthread_key_buckets_ldso[32];
+
static void *
init_tls (size_t naudit)
{
@@ -775,6 +777,8 @@ cannot allocate TLS data structures for initial thread\n");
so it knows not to pass this dtv to the normal realloc. */
GL(dl_initial_dtv) = GET_DTV (tcbp);
+ GL(dl_pthread_keys_data) = _dl_pthread_key_buckets_ldso;
+
/* And finally install it for the main thread. */
call_tls_init_tp (tcbp);
__rtld_tls_init_tp_called = true;
@@ -123,7 +123,6 @@ routines = \
pthread_join_common \
pthread_key_create \
pthread_key_delete \
- pthread_keys \
pthread_kill \
pthread_kill_other_threads \
pthread_mutex_cond_lock \
@@ -321,6 +320,7 @@ tests = \
tst-pthread-gdb-attach-static \
tst-pthread-getcpuclockid-invalid \
tst-pthread-key1-static \
+ tst-pthread-keys-ns \
tst-pthread-timedlock-lockloop \
tst-pthread_exit-nothreads \
tst-pthread_exit-nothreads-static \
@@ -485,6 +485,7 @@ modules-names = \
tst-audit-threads-mod1 \
tst-audit-threads-mod2 \
tst-compat-forwarder-mod \
+ tst-pthread-keys-ns1 \
tst-stack4mod \
tst-tls-debug-mod \
tst-tls3mod \
@@ -687,6 +688,8 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
$(evaluate-test)
endif
+$(objpfx)tst-pthread-keys-ns.out: $(objpfx)tst-pthread-keys-ns1.so
+
LDLIBS-tst-cancel24 = -Wl,--no-as-needed -lstdc++
LDLIBS-tst-cancel24-static = $(LDLIBS-tst-cancel24)
@@ -36,6 +36,7 @@
#include <intprops.h>
#include <setvmaname.h>
+
/* Default alignment of stack. */
#ifndef STACK_ALIGN
# define STACK_ALIGN __alignof__ (long double)
@@ -424,7 +425,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
memset (pd, '\0', sizeof (struct pthread));
/* The first TSD block is included in the TCB. */
- pd->specific[0] = pd->specific_1stblock;
+ _pthread_key_init (pd);
/* Remember the stack-related values. */
pd->stackblock = (char *) stackaddr - size;
@@ -548,7 +549,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
/* We allocated the first block thread-specific data array.
This address will not change for the lifetime of this
descriptor. */
- pd->specific[0] = pd->specific_1stblock;
+ _pthread_key_init (pd);
/* This is at least the second thread. */
pd->header.multiple_threads = 1;
@@ -57,9 +57,6 @@
((PTHREAD_KEYS_MAX + PTHREAD_KEY_2NDLEVEL_SIZE - 1) \
/ PTHREAD_KEY_2NDLEVEL_SIZE)
-
-
-
/* Internal version of the buffer to store cancellation handler
information. */
struct pthread_unwind_buf
@@ -320,6 +317,12 @@ struct pthread
/* We allocate one block of references here. This should be enough
to avoid allocating any memory dynamically for most applications. */
+ /* Note that the new implementation doesn't use the seq, and treats
+ the whole block as a large array of pointers, with the first slot
+ being the number of pointers. We avoid changing the declaration
+ here to avoid compatibility issues. */
+ /* Renamed so I could find all the references to them and fix them,
+ but left in place to keep the struct the same size. */
struct pthread_key_data
{
/* Sequence number. We use uintptr_t to not require padding on
@@ -329,10 +332,12 @@ struct pthread
/* Data pointer. */
void *data;
- } specific_1stblock[PTHREAD_KEY_2NDLEVEL_SIZE];
+ } specific_1stblock_dj[PTHREAD_KEY_2NDLEVEL_SIZE];
+/* Number of keys in block zero of new structure. */
+#define PTHREAD_KEY_ZEROBLOCK_COUNT (PTHREAD_KEY_2NDLEVEL_SIZE * 2 - 2)
/* Two-level array for the thread-specific data. */
- struct pthread_key_data *specific[PTHREAD_KEY_1STLEVEL_SIZE];
+ struct pthread_key_bucket *pthread_key_buckets[PTHREAD_KEY_1STLEVEL_SIZE];
/* Flag which is set when specific data is set. */
bool specific_used;
@@ -15,97 +15,78 @@
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */
+#include <ldsodefs.h>
#include <pthreadP.h>
-/* Deallocate POSIX thread-local-storage. */
+static size_t page_size = 0;
+#define bucket2mlen(b) (((page_size != 0) ? page_size : get_page_size()) << (b-1))
+#define bucket2slots(b) ((bucket2mlen(b) - 2 * sizeof(size_t)) / sizeof (void *))
+
+static size_t
+get_page_size(void)
+{
+ page_size = EXEC_PAGESIZE;
+ return page_size;
+}
+
+/* Called by each thread at its termination, to destruct its key
+ values. Atomics are not needed for thread-specific data, but locks
+ are needed on global data. */
void
__nptl_deallocate_tsd (void)
{
- struct pthread *self = THREAD_SELF;
+ struct pthread_key_bucket **destr_buckets;
+ struct pthread_key_bucket *bucket;
/* Maybe no data was ever allocated. This happens often so we have
a flag for this. */
- if (THREAD_GETMEM (self, specific_used))
+ if (THREAD_GETMEM (THREAD_SELF, specific_used))
{
- size_t round;
- size_t cnt;
-
- round = 0;
- do
- {
- size_t idx;
-
- /* So far no new nonzero data entry. */
- THREAD_SETMEM (self, specific_used, false);
-
- for (cnt = idx = 0; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
- {
- struct pthread_key_data *level2;
-
- level2 = THREAD_GETMEM_NC (self, specific, cnt);
-
- if (level2 != NULL)
- {
- size_t inner;
+ int b, i, iters_left = 32;
+
+ destr_buckets = GL(dl_pthread_keys_data);
+
+ /* Destroy all current key values. */
+ while (--iters_left && THREAD_SELF->specific_used)
+ {
+ /* So far no new nonzero data entry. */
+ THREAD_SETMEM (THREAD_SELF, specific_used, false);
+ for (b = 0; b < 32; b ++)
+ {
+ bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
+
+ if (bucket != NULL)
+ {
+ int slots = bucket->num_slots;
+ for (i = 0; i <slots; i ++)
+ {
+ void (*d)(void *) = atomic_load_relaxed (&destr_buckets[b]->keys[i]);
+ if (d != PTHREAD_KEY_SLOT_FREE && d != NULL
+ && bucket->keys[i] != NULL)
+ {
+ const void *v = bucket->keys[i];
+ bucket->keys[i] = NULL;
+ d ((void *) v);
+ }
+ }
+ }
+ }
+ }
+
+ /* Unmap all mmap'd buckets. */
+ for (b = 1; b < 32; b ++)
+ {
+ bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
+ if (bucket != NULL)
+ {
+ size_t mlen = bucket2mlen (b);
+ __munmap (bucket, mlen);
+ THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, b, 0);
+ }
+ }
- for (inner = 0; inner < PTHREAD_KEY_2NDLEVEL_SIZE;
- ++inner, ++idx)
- {
- void *data = level2[inner].data;
-
- if (data != NULL)
- {
- /* Always clear the data. */
- level2[inner].data = NULL;
-
- /* Make sure the data corresponds to a valid
- key. This test fails if the key was
- deallocated and also if it was
- re-allocated. It is the user's
- responsibility to free the memory in this
- case. */
- if (level2[inner].seq
- == __pthread_keys[idx].seq
- /* It is not necessary to register a destructor
- function. */
- && __pthread_keys[idx].destr != NULL)
- /* Call the user-provided destructor. */
- __pthread_keys[idx].destr (data);
- }
- }
- }
- else
- idx += PTHREAD_KEY_1STLEVEL_SIZE;
- }
-
- if (THREAD_GETMEM (self, specific_used) == 0)
- /* No data has been modified. */
- goto just_free;
- }
- /* We only repeat the process a fixed number of times. */
- while (__builtin_expect (++round < PTHREAD_DESTRUCTOR_ITERATIONS, 0));
-
- /* Just clear the memory of the first block for reuse. */
- memset (&THREAD_SELF->specific_1stblock, '\0',
- sizeof (self->specific_1stblock));
-
- just_free:
- /* Free the memory for the other blocks. */
- for (cnt = 1; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
- {
- struct pthread_key_data *level2;
-
- level2 = THREAD_GETMEM_NC (self, specific, cnt);
- if (level2 != NULL)
- {
- /* The first block is allocated as part of the thread
- descriptor. */
- free (level2);
- THREAD_SETMEM_NC (self, specific, cnt, NULL);
- }
- }
-
- THREAD_SETMEM (self, specific_used, false);
}
+ THREAD_SETMEM (THREAD_SELF, specific_used, false);
}
+
libc_hidden_def (__nptl_deallocate_tsd)
@@ -434,6 +434,7 @@ start_thread (void *arg)
{
/* Store the new cleanup handler info. */
THREAD_SETMEM (pd, cleanup_jmp_buf, &unwind_buf);
+ assert (pd == THREAD_SELF);
internal_signal_restore_set (&pd->sigmask);
@@ -16,51 +16,43 @@
<https://www.gnu.org/licenses/>. */
#include <stdlib.h>
-#include "pthreadP.h"
#include <shlib-compat.h>
+#include "pthreadP.h"
+#include "ldsodefs.h"
+
+extern void *___ldso_pthread_getspecific (pthread_key_t key)
+ weak_function;
+
void *
___pthread_getspecific (pthread_key_t key)
{
- struct pthread_key_data *data;
-
- /* Special case access to the first 2nd-level block. This is the
- usual case. */
- if (__glibc_likely (key < PTHREAD_KEY_2NDLEVEL_SIZE))
- data = &THREAD_SELF->specific_1stblock[key];
- else
- {
- /* Verify the key is sane. */
- if (key >= PTHREAD_KEYS_MAX)
- /* Not valid. */
- return NULL;
+ int b = PTHREAD_KEY_DECODE_BUCKET(key);
+ int i = PTHREAD_KEY_DECODE_SLOT(key);
+ struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
- unsigned int idx1st = key / PTHREAD_KEY_2NDLEVEL_SIZE;
- unsigned int idx2nd = key % PTHREAD_KEY_2NDLEVEL_SIZE;
+ /* The spec doesn't allow for errors, so remove these checks after
+ development? */
+ if (buckets == NULL)
+ return NULL;
+ if (buckets[b] == NULL)
+ return NULL;
- /* If the sequence number doesn't match or the key cannot be defined
- for this thread since the second level array is not allocated
- return NULL, too. */
- struct pthread_key_data *level2 = THREAD_GETMEM_NC (THREAD_SELF,
- specific, idx1st);
- if (level2 == NULL)
- /* Not allocated, therefore no data. */
- return NULL;
+ struct pthread_key_bucket *bucket = buckets[b];
+ if (i < 0 || i >= bucket->num_slots)
+ return NULL;
- /* There is data. */
- data = &level2[idx2nd];
- }
+ if (bucket->keys[i] == PTHREAD_KEY_SLOT_FREE)
+ return NULL;
- void *result = data->data;
- if (result != NULL)
- {
- uintptr_t seq = data->seq;
+ bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
- if (__glibc_unlikely (seq != __pthread_keys[key].seq))
- result = data->data = NULL;
- }
+ /* This thread has not yet allocated its copy of the bucket, so it
+ has never stored a value for this key. */
+ if (bucket == NULL)
+ return NULL;
- return result;
+ return bucket->keys[i];
}
versioned_symbol (libc, ___pthread_getspecific, pthread_getspecific,
GLIBC_2_34);
@@ -15,37 +15,152 @@
License along with the GNU C Library; if not, see
<https://www.gnu.org/licenses/>. */
+#include <assert.h>
#include <errno.h>
-#include "pthreadP.h"
+#include <list.h>
#include <atomic.h>
#include <shlib-compat.h>
+#include "ldsodefs.h"
+#include "pthreadP.h"
+
+_Static_assert (PTHREAD_KEY_1STLEVEL_SIZE >= 32, "incompatible TSD sizes");
+
+__libc_lock_define_initialized_recursive (, __pthread_key_lock)
+
+/* This struct must match the global definition in pthreadP.h so that
+ the slot counts match. */
+static struct {
+ size_t num_slots;
+ size_t next_free_slot_hint;
+ void (*destrs[PTHREAD_KEY_ZEROBLOCK_COUNT])(void *);
+} key_bucket_zero =
+ {
+ PTHREAD_KEY_ZEROBLOCK_COUNT, 0,
+ { [0 ... PTHREAD_KEY_ZEROBLOCK_COUNT-1] = PTHREAD_KEY_SLOT_FREE }
+ };
+
+/* Table of the key information. */
+static struct pthread_key_bucket *__pthread_key_buckets_local[32] =
+ { (struct pthread_key_bucket *)&key_bucket_zero };
+
+static size_t page_size = 0;
+#define bucket2mlen(b) (((page_size != 0) ? page_size : get_page_size()) << (b-1))
+#define bucket2slots(b) ((bucket2mlen(b) - 2 * sizeof(size_t)) / sizeof (void *))
+
+static size_t
+get_page_size(void)
+{
+ page_size = EXEC_PAGESIZE;
+ return page_size;
+}
+
+void
+_pthread_key_init (struct pthread *pd)
+{
+ /* Initialize global key data. */
+ if (GL(dl_pthread_keys_data) == NULL)
+ GL(dl_pthread_keys_data) = __pthread_key_buckets_local;
+
+ /* Initialize thread key data. */
+ struct pthread_key_bucket *bucket
+ = (struct pthread_key_bucket *) & (pd->specific_1stblock_dj);
+ bucket->num_slots = PTHREAD_KEY_ZEROBLOCK_COUNT;
+ bucket->next_free_slot_hint = 0;
+ memset (bucket->keys, 0, PTHREAD_KEY_ZEROBLOCK_COUNT * sizeof (void *));
+ pd->pthread_key_buckets[0] = bucket;
+}
+libc_hidden_def (_pthread_key_init)
+
+
int
___pthread_key_create (pthread_key_t *key, void (*destr) (void *))
{
- /* Find a slot in __pthread_keys which is unused. */
- for (size_t cnt = 0; cnt < PTHREAD_KEYS_MAX; ++cnt)
+ int b, i;
+ list_t *runp;
+
+ PTHREAD_KEY_LOCK;
+ struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
+ if (buckets == NULL)
{
- uintptr_t seq = __pthread_keys[cnt].seq;
+ GL(dl_pthread_keys_data) = __pthread_key_buckets_local;
+ buckets = GL(dl_pthread_keys_data);
+ }
- if (KEY_UNUSED (seq) && KEY_USABLE (seq)
- /* We found an unused slot. Try to allocate it. */
- && ! atomic_compare_and_exchange_bool_acq (&__pthread_keys[cnt].seq,
- seq + 1, seq))
+ for (b = 0; b < 32; b ++)
+ {
+ struct pthread_key_bucket *bucket = buckets[b];
+
+ if (bucket == NULL)
{
- /* Remember the destructor. */
- __pthread_keys[cnt].destr = destr;
+ if (b == 0)
+ __builtin_trap (); /* Bucket 0 is statically allocated and must exist. */
+ else
+ {
+ void *v;
+ size_t mlen = bucket2mlen (b);
+ v = __mmap (NULL, mlen, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+
+ if (v == MAP_FAILED)
+ {
+ PTHREAD_KEY_UNLOCK;
+ return ENOMEM; /* not in spec, but EAGAIN is the only other option. */
+ }
+ if (v == NULL)
+ __builtin_trap (); /* An anonymous mapping must not land at address 0. */
- /* Return the key to the caller. */
- *key = cnt;
+ bucket = (struct pthread_key_bucket *) v;
+ buckets[b] = bucket;
+ int num_slots = bucket2slots (b);
+ bucket->num_slots = num_slots;
+ bucket->next_free_slot_hint = 0;
+ for (i=0; i<bucket->num_slots; i++)
+ bucket->keys[i] = PTHREAD_KEY_SLOT_FREE;
- /* The call succeeded. */
- return 0;
+ }
}
+
+ /* Initialize all values to NULL. In the old code, this was
+ optimized by storing a sequence with each destr and value to
+ detect "stale" values. We could optimize this, but the easy
+ option takes less memory. */
+ int f = bucket->next_free_slot_hint;
+ int s = bucket->num_slots;
+ for (i = 0; i < bucket->num_slots; i ++)
+ if (bucket->keys[(i+f)%s] == PTHREAD_KEY_SLOT_FREE)
+ {
+ int idx = (i+f)%s;
+ struct pthread_key_bucket *sb;
+
+ bucket->keys[idx] = (void *) destr;
+ bucket->next_free_slot_hint = (idx+1)%s;
+ *key = PTHREAD_KEY_ENCODE (b, idx);
+
+ list_for_each (runp, &GL (dl_stack_used))
+ {
+ struct pthread *t = list_entry (runp, struct pthread, list);
+ sb = THREAD_GETMEM_NC (t, pthread_key_buckets, b);
+ if (sb)
+ atomic_store_relaxed (& sb->keys[idx], NULL);
+ }
+
+ list_for_each (runp, &GL (dl_stack_user))
+ {
+ struct pthread *t = list_entry (runp, struct pthread, list);
+ sb = THREAD_GETMEM_NC (t, pthread_key_buckets, b);
+ if (sb)
+ atomic_store_relaxed (& sb->keys[idx], NULL);
+ }
+
+ PTHREAD_KEY_UNLOCK;
+ return 0;
+ }
}
+ PTHREAD_KEY_UNLOCK;
return EAGAIN;
}
+
versioned_symbol (libc, ___pthread_key_create, __pthread_key_create,
GLIBC_2_34);
libc_hidden_ver (___pthread_key_create, __pthread_key_create)
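Reviewer note on the bucket sizing used by `bucket2mlen` and `bucket2slots` above: each bucket past the first is one page shifted left by `b - 1`, and two `size_t` header words are subtracted before dividing by the pointer size. The following is a minimal standalone sketch of that arithmetic, not the patch's code. The `demo_` names are hypothetical, and the page size is passed in as a parameter rather than taken from `EXEC_PAGESIZE`.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes mapped for bucket B.  Only valid for B >= 1: bucket 1 is one
   page and every later bucket doubles the previous one.  Bucket 0 is
   not mapped at all in the patch, so it is excluded here.  */
static size_t
demo_bucket_bytes (size_t page_size, unsigned int b)
{
  assert (b >= 1);
  return page_size << (b - 1);
}

/* Usable key slots in bucket B: the mapping minus the two size_t
   header words (slot count and free-slot hint), in pointer units.  */
static size_t
demo_bucket_slots (size_t page_size, unsigned int b)
{
  return (demo_bucket_bytes (page_size, b) - 2 * sizeof (size_t))
	 / sizeof (void *);
}
```

With a 4096-byte page and 8-byte pointers, `demo_bucket_slots (4096, 1)` gives 510, which agrees with the figure quoted in the commit message for the first mmap'd bucket.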
@@ -16,27 +16,47 @@
<https://www.gnu.org/licenses/>. */
#include <errno.h>
-#include "pthreadP.h"
#include <atomic.h>
#include <shlib-compat.h>
+#include "ldsodefs.h"
+#include "pthreadP.h"
+
+extern int ___ldso_pthread_key_delete (pthread_key_t key)
+ weak_function;
+
int
___pthread_key_delete (pthread_key_t key)
{
- int result = EINVAL;
+ int b = PTHREAD_KEY_DECODE_BUCKET(key);
+ int i = PTHREAD_KEY_DECODE_SLOT(key);
+ struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
+
+ PTHREAD_KEY_LOCK;
- if (__glibc_likely (key < PTHREAD_KEYS_MAX))
+ if (buckets[b] == NULL)
{
- unsigned int seq = __pthread_keys[key].seq;
+ PTHREAD_KEY_UNLOCK;
+ return EINVAL;
+ }
- if (__builtin_expect (! KEY_UNUSED (seq), 1)
- && ! atomic_compare_and_exchange_bool_acq (&__pthread_keys[key].seq,
- seq + 1, seq))
- /* We deleted a valid key. */
- result = 0;
+ struct pthread_key_bucket *bucket = buckets[b];
+ if (i < 0 || i >= bucket->num_slots)
+ {
+ PTHREAD_KEY_UNLOCK;
+ return EINVAL;
+ }
+
+ if (bucket->keys[i] == PTHREAD_KEY_SLOT_FREE)
+ {
+ PTHREAD_KEY_UNLOCK;
+ return EINVAL;
}
- return result;
+ bucket->keys[i] = PTHREAD_KEY_SLOT_FREE;
+ bucket->next_free_slot_hint = i;
+ PTHREAD_KEY_UNLOCK;
+ return 0;
}
versioned_symbol (libc, ___pthread_key_delete, pthread_key_delete,
GLIBC_2_34);
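Reviewer note on the free-slot hint shared by `pthread_key_create` and `pthread_key_delete` above: creation scans the bucket circularly starting at the hint and then advances the hint past the slot it took, while deletion points the hint back at the slot it just released. The following is a small self-contained sketch of that behaviour under simplifying assumptions: a fixed 8-slot bucket, no locking, and hypothetical `demo_` names that do not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Marker for an unallocated slot.  NULL cannot be used because a key
   may legitimately have a NULL destructor.  */
#define DEMO_SLOT_FREE ((void *) (intptr_t) -1)
#define DEMO_SLOTS 8

struct demo_bucket
{
  size_t num_slots;
  size_t next_free_slot_hint;
  void *keys[DEMO_SLOTS];
};

static void
demo_bucket_init (struct demo_bucket *b)
{
  b->num_slots = DEMO_SLOTS;
  b->next_free_slot_hint = 0;
  for (size_t i = 0; i < DEMO_SLOTS; i++)
    b->keys[i] = DEMO_SLOT_FREE;
}

/* Claim a free slot, scanning circularly from the hint.  Returns the
   slot index, or -1 if the bucket is full.  */
static long
demo_bucket_alloc (struct demo_bucket *b, void *destr)
{
  size_t f = b->next_free_slot_hint;
  size_t s = b->num_slots;
  for (size_t i = 0; i < s; i++)
    {
      size_t idx = (i + f) % s;
      if (b->keys[idx] == DEMO_SLOT_FREE)
	{
	  b->keys[idx] = destr;
	  b->next_free_slot_hint = (idx + 1) % s;
	  return (long) idx;
	}
    }
  return -1;
}

/* Release a slot and point the hint at it so it is reused first.  */
static void
demo_bucket_free (struct demo_bucket *b, size_t idx)
{
  b->keys[idx] = DEMO_SLOT_FREE;
  b->next_free_slot_hint = idx;
}
```

One consequence visible in the sketch is that a deleted key index is handed out again by the very next creation in that bucket, so stale per-thread values have to be cleared at creation time, as the patch does by walking the thread lists.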
deleted file mode 100644
@@ -1,23 +0,0 @@
-/* Table of pthread_key_create keys and their destructors.
- Copyright (C) 2004-2026 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <pthreadP.h>
-
-/* Table of the key information. */
-struct pthread_key_struct __pthread_keys[PTHREAD_KEYS_MAX];
-libc_hidden_data_def (__pthread_keys)
@@ -17,75 +17,75 @@
#include <errno.h>
#include <stdlib.h>
-#include "pthreadP.h"
#include <shlib-compat.h>
+#include "ldsodefs.h"
+#include "pthreadP.h"
+
+static size_t page_size = 0;
+#define bucket2mlen(b) (((page_size != 0) ? page_size : get_page_size()) << (b-1))
+#define bucket2slots(b) ((bucket2mlen(b) - 2 * sizeof(size_t)) / sizeof (void *))
+
+static size_t
+get_page_size(void)
+{
+ page_size = EXEC_PAGESIZE;
+ return page_size;
+}
+
+
int
___pthread_setspecific (pthread_key_t key, const void *value)
{
- struct pthread *self;
- unsigned int idx1st;
- unsigned int idx2nd;
- struct pthread_key_data *level2;
- unsigned int seq;
+ int b = PTHREAD_KEY_DECODE_BUCKET(key);
+ int i = PTHREAD_KEY_DECODE_SLOT(key);
- self = THREAD_SELF;
+ /* We must check if the key is valid. */
- /* Special case access to the first 2nd-level block. This is the
- usual case. */
- if (__glibc_likely (key < PTHREAD_KEY_2NDLEVEL_SIZE))
- {
- /* Verify the key is sane. */
- if (KEY_UNUSED ((seq = __pthread_keys[key].seq)))
- /* Not valid. */
- return EINVAL;
+ /* The bucket number can only be 0..31, and all are valid. */
+ struct pthread_key_bucket **buckets = GL(dl_pthread_keys_data);
+ struct pthread_key_bucket *bucket = buckets[b];
+ if (bucket == NULL)
+ return EINVAL;
+ if (i < 0 || i >= bucket->num_slots)
+ return EINVAL;
+ if (bucket->keys[i] == PTHREAD_KEY_SLOT_FREE)
+ return EINVAL;
- level2 = &self->specific_1stblock[key];
+ /* If we need to mmap a bucket, do so now. */
- /* Remember that we stored at least one set of data. */
- if (value != NULL)
- THREAD_SETMEM (self, specific_used, true);
- }
- else
- {
- if (key >= PTHREAD_KEYS_MAX
- || KEY_UNUSED ((seq = __pthread_keys[key].seq)))
- /* Not valid. */
- return EINVAL;
+ buckets = THREAD_SELF -> pthread_key_buckets;
+ bucket = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, b);
- idx1st = key / PTHREAD_KEY_2NDLEVEL_SIZE;
- idx2nd = key % PTHREAD_KEY_2NDLEVEL_SIZE;
+ if (bucket == NULL)
+ {
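+ /* bucket2mlen shifts by b - 1, so it must not be evaluated for
+ bucket 0, which lives in the thread descriptor. */
+ size_t mlen = b == 0 ? 0 : bucket2mlen (b);
+ size_t slots = b == 0 ? PTHREAD_KEY_ZEROBLOCK_COUNT : bucket2slots (b);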
+ struct pthread_key_bucket *m;
- /* This is the second level array. Allocate it if necessary. */
- level2 = THREAD_GETMEM_NC (self, specific, idx1st);
- if (level2 == NULL)
+ if (b == 0)
{
- if (value == NULL)
- /* We don't have to do anything. The value would in any case
- be NULL. We can save the memory allocation. */
- return 0;
-
- level2
- = (struct pthread_key_data *) calloc (PTHREAD_KEY_2NDLEVEL_SIZE,
- sizeof (*level2));
- if (level2 == NULL)
+ m = (struct pthread_key_bucket *) & THREAD_SELF->specific_1stblock_dj;
+ slots = PTHREAD_KEY_ZEROBLOCK_COUNT;
+ }
+ else
+ {
+ m = __mmap (NULL, mlen, PROT_READ|PROT_WRITE,
+ MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
+ if (m == MAP_FAILED)
return ENOMEM;
-
- THREAD_SETMEM_NC (self, specific, idx1st, level2);
}
- /* Pointer to the right array element. */
- level2 = &level2[idx2nd];
+ m->num_slots = slots;
+ m->next_free_slot_hint = 0;
+ memset (m->keys, 0, slots * sizeof (void *));
- /* Remember that we stored at least one set of data. */
- THREAD_SETMEM (self, specific_used, true);
+ bucket = buckets[b] = m;
+ THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, b, m);
}
- /* Store the data and the sequence number so that we can recognize
- stale data. */
- level2->seq = seq;
- level2->data = (void *) value;
-
+ THREAD_SETMEM (THREAD_SELF, specific_used, 1);
+ bucket->keys[i] = (void *) value;
return 0;
}
versioned_symbol (libc, ___pthread_setspecific, pthread_setspecific,
new file mode 100644
@@ -0,0 +1,70 @@
+/* Verify that keys are in sync across namespaces.
+ Copyright (C) 2026 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <support/check.h>
+#include <support/xdlfcn.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <pthread.h>
+
+/* This file loads ns1.so in a separate DSO namespace, and uses it to
+ create keys in that namespace, and compares to keys created in this
+ namespace. */
+
+#define NUM_KEYS 30
+static pthread_key_t our_keys[NUM_KEYS];
+static pthread_key_t ns_keys[NUM_KEYS];
+
+static int (*ns_pthread_key_create)(pthread_key_t *key,
+ void (*__destr_function) (void *));
+
+static int
+do_test (void)
+{
+ int i, j, errors=0;
+ void *so;
+
+ so = xdlmopen (LM_ID_NEWLM, "tst-pthread-keys-ns1.so", RTLD_NOW);
+ ns_pthread_key_create = xdlsym (so, "ns_pthread_key_create");
+
+ for (i=0; i<NUM_KEYS; i++)
+ {
+ TEST_VERIFY (pthread_key_create (&our_keys[i], NULL) == 0);
+ TEST_VERIFY (ns_pthread_key_create (&ns_keys[i], NULL) == 0);
+
+ printf(" %08x %08x\n", our_keys[i], ns_keys[i]);
+ }
+
+ for (i=0; i<NUM_KEYS; i++)
+ for (j=0; j<NUM_KEYS; j++)
+ {
+ if (our_keys[i] == ns_keys[j])
+ {
+ if (errors < 5)
+ printf("collision %x[%d] %x[%d]\n", our_keys[i], i, ns_keys[j], j);
+ errors ++;
+ }
+ }
+
+ xdlclose (so);
+
+ return errors;
+}
+
+#include <support/test-driver.c>
new file mode 100644
@@ -0,0 +1,29 @@
+/* Verify that keys are in sync across namespaces.
+ Copyright (C) 2026 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <pthread.h>
+#include <support/check.h>
+
+/* This file just connects the test case to a glibc in a different DSO
+ namespace. */
+
+int ns_pthread_key_create (pthread_key_t *key,
+ void (*__destr_function) (void *))
+{
+ return pthread_key_create (key, __destr_function);
+}
@@ -56,7 +56,6 @@ DB_STRUCT_FIELD (pthread, start_routine)
DB_STRUCT_FIELD (pthread, cancelhandling)
DB_STRUCT_FIELD (pthread, schedpolicy)
DB_STRUCT_FIELD (pthread, schedparam_sched_priority)
-DB_STRUCT_FIELD (pthread, specific)
DB_STRUCT_FIELD (pthread, eventbuf)
DB_STRUCT_FIELD (pthread, eventbuf_eventmask)
DB_STRUCT_ARRAY_FIELD (pthread, eventbuf_eventmask_event_bits)
@@ -81,17 +80,10 @@ DB_VARIABLE (__nptl_nthreads)
DB_VARIABLE (__nptl_last_event)
DB_MAIN_VARIABLE (__nptl_initial_report_events)
-DB_ARRAY_VARIABLE (__pthread_keys)
DB_STRUCT (pthread_key_struct)
DB_STRUCT_FIELD (pthread_key_struct, seq)
DB_STRUCT_FIELD (pthread_key_struct, destr)
-DB_STRUCT (pthread_key_data)
-DB_STRUCT_FIELD (pthread_key_data, seq)
-DB_STRUCT_FIELD (pthread_key_data, data)
-DB_STRUCT (pthread_key_data_level2)
-DB_STRUCT_ARRAY_FIELD (pthread_key_data_level2, data)
-
DB_STRUCT_FIELD (link_map, l_tls_modid)
DB_STRUCT_FIELD (link_map, l_tls_offset)
@@ -24,6 +24,9 @@ td_err_e
td_ta_tsd_iter (const td_thragent_t *ta_arg, td_key_iter_f *callback,
void *cbdata_p)
{
+#if 1
+ return TD_ERR;
+#else
td_thragent_t *const ta = (td_thragent_t *) ta_arg;
td_err_e err;
void *keys;
@@ -77,4 +80,5 @@ td_ta_tsd_iter (const td_thragent_t *ta_arg, td_key_iter_f *callback,
}
return TD_OK;
+#endif
}
@@ -59,11 +59,6 @@ td_thr_get_info (const td_thrhandle_t *th, td_thrinfo_t *infop)
if (err != TD_OK)
return err;
- err = DB_GET_FIELD_ADDRESS (tls, th->th_ta_p, th->th_unique,
- pthread, specific, 0);
- if (err != TD_OK)
- return err;
-
err = DB_GET_FIELD_LOCAL (schedpolicy, th->th_ta_p, copy, pthread,
schedpolicy, 0);
if (err != TD_OK)
@@ -18,11 +18,17 @@
#include <stdint.h>
#include "thread_dbP.h"
+#include <stdio.h>
td_err_e
td_thr_tsd (const td_thrhandle_t *th, const thread_key_t tk, void **data)
{
+#if 1
+ /* As far as I can tell, this interface isn't used by anything, the
+ internals are not documented, and there's no testsuite. - DJ */
+ return TD_ERR;
+#else
td_err_e err;
psaddr_t tk_seq, level1, level2, seq, value;
void *copy;
@@ -92,4 +98,5 @@ td_thr_tsd (const td_thrhandle_t *th, const thread_key_t tk, void **data)
*data = value;
return err;
+#endif
}
@@ -467,6 +467,10 @@ struct rtld_global
/* Mutex protecting the stack lists. */
EXTERN int _dl_stack_cache_lock;
+
+ /* Pthread keys global destructor tables. Actually a pointer to an
+ array of 32 struct pthread_key_bucket pointers. */
+ EXTERN void *_dl_pthread_keys_data;
#endif
#if __PTHREAD_HTL
/* The total number of thread IDs currently in use, or on the list of
@@ -1455,6 +1459,7 @@ __rtld_mutex_init (void)
/* The initialization happens later (!__PHREAD_NPTL) or is not
needed at all (!SHARED). */
}
+
#endif /* !__PHREAD_NPTL */
/* Implementation of GL (dl_libc_freeres). */
@@ -74,7 +74,7 @@ __tls_init_tp (void)
/* Early initialization of the TCB. */
pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->joinstate);
- THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]);
+ _pthread_key_init (pd);
THREAD_SETMEM (pd, stack_mode, ALLOCATE_GUARD_USER);
THREAD_SETMEM (pd, joinstate, THREAD_STATE_JOINABLE);
@@ -113,22 +113,22 @@ reclaim_stacks (void)
if (curp->specific_used)
{
- /* Clear the thread-specific data. */
- memset (curp->specific_1stblock, '\0',
- sizeof (curp->specific_1stblock));
-
curp->specific_used = false;
- for (size_t cnt = 1; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
- if (curp->specific[cnt] != NULL)
- {
- memset (curp->specific[cnt], '\0',
- sizeof (curp->specific_1stblock));
-
- /* We have allocated the block which we do not
- free here so re-set the bit. */
- curp->specific_used = true;
- }
+ for (size_t cnt = 0; cnt < PTHREAD_KEY_1STLEVEL_SIZE; ++cnt)
+ {
+ pthread_key_bucket *b = curp->pthread_key_buckets[cnt];
+ if (b != NULL)
+ {
+ memset (b->keys, '\0',
+ b->num_slots * sizeof (b->keys[0]));
+
+ /* We have allocated the block which we do not
+ free here so re-set the bit. */
+ if (cnt > 0)
+ curp->specific_used = true;
+ }
+ }
}
call_function_static_weak (__getrandom_reset_state, curp);
@@ -184,9 +184,29 @@ extern int __attr_list_lock attribute_hidden;
/* Concurrency handling. */
extern int __concurrency_level attribute_hidden;
-/* Thread-local data key handling. */
-extern struct pthread_key_struct __pthread_keys[PTHREAD_KEYS_MAX];
-libc_hidden_proto (__pthread_keys)
+/* Thread-local data key handling. Must match struct in
+ dl-pthread-keys.c. */
+typedef struct pthread_key_bucket {
+ size_t num_slots;
+ size_t next_free_slot_hint; /* Unused in the per-thread data. */
+ void *keys[1]; /* Not actually 1 but nptl_db needs something concrete. */
+} pthread_key_bucket;
+
+#define PTHREAD_KEY_ENCODE(b,i) ((pthread_key_t) ((b) | ((i) << 5)))
+#define PTHREAD_KEY_DECODE_BUCKET(k) (((size_t) (k)) & 31)
+#define PTHREAD_KEY_DECODE_SLOT(k) (((size_t) (k)) >> 5)
+#define PTHREAD_KEY_SLOT_FREE ((void (*)(void *)) (-1UL))
+#if IS_IN(rtld)
+#define PTHREAD_KEY_LOCK
+#define PTHREAD_KEY_UNLOCK
+#else
+__libc_lock_define_recursive (extern, __pthread_key_lock attribute_hidden);
+#define PTHREAD_KEY_LOCK __libc_lock_lock_recursive (__pthread_key_lock)
+#define PTHREAD_KEY_UNLOCK __libc_lock_unlock_recursive (__pthread_key_lock)
+#endif
+
+extern void _pthread_key_init (struct pthread *);
+libc_hidden_proto (_pthread_key_init);
/* Number of threads running. */
extern unsigned int __nptl_nthreads;
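Reviewer note on the key layout defined by `PTHREAD_KEY_ENCODE`, `PTHREAD_KEY_DECODE_BUCKET` and `PTHREAD_KEY_DECODE_SLOT` above: the five low bits select one of 32 buckets and the remaining bits select the slot within that bucket. The following is a minimal sketch of the same packing written as inline functions. It assumes an `unsigned int` key type, and the `demo_` names are illustrative rather than part of the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for pthread_key_t.  */
typedef unsigned int demo_key_t;

#define DEMO_BUCKET_BITS 5
#define DEMO_BUCKET_MASK ((1u << DEMO_BUCKET_BITS) - 1)

/* Pack a bucket index (0..31) and a slot index into one key.  */
static inline demo_key_t
demo_key_encode (unsigned int bucket, unsigned int slot)
{
  assert (bucket <= DEMO_BUCKET_MASK);
  return (demo_key_t) (bucket | (slot << DEMO_BUCKET_BITS));
}

/* The five least significant bits index the first level.  */
static inline unsigned int
demo_key_bucket (demo_key_t key)
{
  return key & DEMO_BUCKET_MASK;
}

/* The remaining bits index the slot within the bucket.  */
static inline unsigned int
demo_key_slot (demo_key_t key)
{
  return key >> DEMO_BUCKET_BITS;
}
```

Under this layout, consecutive keys created in bucket 0 differ by 32 rather than by 1, which is worth keeping in mind when reading the key values printed by the namespace test.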
@@ -38,25 +38,25 @@ do_test (void)
THREAD_SETMEM (THREAD_SELF, header.ssp_base, saved_ssp_base);
#ifndef __ILP32__
- struct pthread_key_data *saved_specific, *specific;
- saved_specific = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
+ struct pthread_key_bucket *saved_pthread_key_buckets, *pthread_key_buckets;
+ saved_pthread_key_buckets = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, 1);
uintptr_t value = (1UL << 57) - 1;
- THREAD_SETMEM_NC (THREAD_SELF, specific, 1,
- (struct pthread_key_data *) value);
- specific = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
- if (specific != (struct pthread_key_data *) value)
+ THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, 1,
+ (struct pthread_key_bucket *) value);
+ pthread_key_buckets = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, 1);
+ if (pthread_key_buckets != (struct pthread_key_bucket *) value)
FAIL_EXIT1 ("THREAD_GETMEM_NC: %p != %p",
- specific, (struct pthread_key_data *) value);
+ pthread_key_buckets, (struct pthread_key_bucket *) value);
- THREAD_SETMEM_NC (THREAD_SELF, specific, 1,
- (struct pthread_key_data *) -1UL);
- specific = THREAD_GETMEM_NC (THREAD_SELF, specific, 1);
- if (specific != (struct pthread_key_data *) -1UL)
+ THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, 1,
+ (struct pthread_key_bucket *) -1UL);
+ pthread_key_buckets = THREAD_GETMEM_NC (THREAD_SELF, pthread_key_buckets, 1);
+ if (pthread_key_buckets != (struct pthread_key_bucket *) -1UL)
FAIL_EXIT1 ("THREAD_GETMEM_NC: %p != %p",
- specific, (struct pthread_key_data *) -1UL);
+ pthread_key_buckets, (struct pthread_key_bucket *) -1UL);
- THREAD_SETMEM_NC (THREAD_SELF, specific, 1, saved_specific);
+ THREAD_SETMEM_NC (THREAD_SELF, pthread_key_buckets, 1, saved_pthread_key_buckets);
#endif
return 0;
}