[v6,3/3] RISC-V: Implement TLS Descriptors.

Message ID 20240329061834.40019-4-ishitatsuyuki@gmail.com
State New
Headers
Series RISC-V: Implement TLS Descriptors. |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Testing passed

Commit Message

Tatsuyuki Ishi March 29, 2024, 6:18 a.m. UTC
  This is mostly based off AArch64 implementation, with some adaptations to
different TLS DTV offsets and calling conventions.

As we have not officially committed to a vector calling convention, all
vector registers are saved in the calling convention wrapper. This can be
revisited once we decide which registers will be callee-saved.
---
 sysdeps/riscv/Makefile                      |  10 +
 sysdeps/riscv/dl-lookupcfg.h                |  27 ++
 sysdeps/riscv/dl-machine.h                  |  50 +++-
 sysdeps/riscv/dl-tlsdesc.S                  | 269 ++++++++++++++++++++
 sysdeps/riscv/dl-tlsdesc.h                  |  48 ++++
 sysdeps/riscv/linkmap.h                     |   1 +
 sysdeps/riscv/preconfigure                  |   1 +
 sysdeps/riscv/tlsdesc.c                     |  38 +++
 sysdeps/riscv/tlsdesc.sym                   |  19 ++
 sysdeps/riscv/tst-gnu2-tls2.c               |  33 +++
 sysdeps/unix/sysv/linux/riscv/localplt.data |   2 +
 11 files changed, 497 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/riscv/dl-lookupcfg.h
 create mode 100644 sysdeps/riscv/dl-tlsdesc.S
 create mode 100644 sysdeps/riscv/dl-tlsdesc.h
 create mode 100644 sysdeps/riscv/tlsdesc.c
 create mode 100644 sysdeps/riscv/tlsdesc.sym
 create mode 100644 sysdeps/riscv/tst-gnu2-tls2.c
  

Comments

Florian Weimer April 1, 2024, 1:23 p.m. UTC | #1
* Tatsuyuki Ishi:

> diff --git a/sysdeps/unix/sysv/linux/riscv/localplt.data b/sysdeps/unix/sysv/linux/riscv/localplt.data
> index ea887042e0..01710df22d 100644
> --- a/sysdeps/unix/sysv/linux/riscv/localplt.data
> +++ b/sysdeps/unix/sysv/linux/riscv/localplt.data
> @@ -6,3 +6,5 @@ libc.so: free
>  libc.so: malloc
>  libc.so: memset ?
>  libc.so: realloc
> +# The dynamic loader needs __tls_get_addr for TLS.
> +ld.so: __tls_get_addr

This shouldn't be needed if you add a proper hidden alias.

Thanks,
Florian
  
Adhemerval Zanella Netto April 1, 2024, 7:29 p.m. UTC | #2
On 29/03/24 03:18, Tatsuyuki Ishi wrote:
> This is mostly based off AArch64 implementation, with some adaptations to
> different TLS DTV offsets and calling conventions.
> 
> As we have not officially committed to a vector calling convention, all
> vector registers are saved in the calling convention wrapper. This can be
> revisited once we decide which registers will be callee-saved.
> ---

> +/* The fast path does not call function and does not need to align sp, but
> +   to simplify handling when going into the slow path, keep sp aligned all
> +   the time.
> + */
> +#define FRAME_SIZE_FAST (-((-3 * SZREG) & ALMASK))
> +
> +/* The slow path save slot layout, from lower address to higher address, is:
> +   1. 32 vector registers
> +   2. 12 GP registers
> +   3. 20 FP registers
> +   4. 3 vector CSR registers
> +
> +   1. has machine-dependent size, and hence is not included in FRAME_SIZE_SLOW.
> +   Additionally, the vector register save area needs to be naturally aligned:
> +   this is satisfied as a side effect of 16-byte stack alignment.
> +   The size of vector save area, OTOH, also needs to satisfy stack alignment, as
> +   implementations can have vector registers smaller than 16 bytes.
> +   For now, the size is guaranteed to be a multiple of 16 as we save all 32 vector registers.
> + */
> +#if defined(__riscv_float_abi_soft)
> +# define FRAME_SIZE_SLOW (-((-12 * SZREG) & ALMASK))
> +#elif defined(__riscv_vector)
> +# define FRAME_SIZE_SLOW (-((-15 * SZREG - 20 * SZFREG) & ALMASK))

We already have 6 different RISC-V abis on build-many-glibcs.py, plus
the ZBB/XTHREADB usage on string-fza.h. With this we will another 
sub-variant we will need to build/check, which will make RISC-V even
more MIPS-like with its unfeasible number of ABIs.

Maybe a better option, now that glibc has internally riscv_hwprobe
support and that RVV is only support for 6.5, to use instead of adding
another ABI variant. 

It could either through ifunc variants, like x86, or by embedding the
ABI check within the _dl_tlsdesc_dynamic, like ARM.
  
Tatsuyuki Ishi April 2, 2024, 3:36 a.m. UTC | #3
> On Apr 2, 2024, at 4:29, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> wrote:
> 
> Maybe a better option, now that glibc has internally riscv_hwprobe
> support and that RVV is only support for 6.5, to use instead of adding
> another ABI variant. 

Does this mean that it can be assumed that RVV code will only be executed on an environment that also supports riscv_hwprobe? In that case, I agree that we should switch the RVV path to use feature detection.

I suppose the softfp path will remain the same since not all softfp environment will support hwprobe per the reasoning.

Tatsuyuki.
  
Adhemerval Zanella Netto April 2, 2024, 1:35 p.m. UTC | #4
On 02/04/24 00:36, Tatsuyuki Ishi wrote:
>> On Apr 2, 2024, at 4:29, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> wrote:
>>
>> Maybe a better option, now that glibc has internally riscv_hwprobe
>> support and that RVV is only support for 6.5, to use instead of adding
>> another ABI variant. 
> 
> Does this mean that it can be assumed that RVV code will only be executed on an environment that also supports riscv_hwprobe? In that case, I agree that we should switch the RVV path to use feature detection.

Unless RVV support is backported without backporting riscv_hwprobe as well,
although I expected that RISC-V maintainer would avoid it because the
whole riscv_hwprobe is to enable runtime selection for RVV and alike
features.

> 
> I suppose the softfp path will remain the same since not all softfp environment will support hwprobe per the reasoning.

I take that softfp is a de-facto ABI for RISC-V, at least this what we
have on build-many-glibcs.py:

  riscv64-linux-gnu-rv64imac-lp64
  riscv64-linux-gnu-rv64imafdc-lp64
  riscv64-linux-gnu-rv64imafdc-lp64d

I am not sure whether RISC-V maintainer would like to move to profilers
or would keep this extensions composability manner.

> 
> Tatsuyuki.
  
Palmer Dabbelt April 2, 2024, 3:25 p.m. UTC | #5
On Tue, 02 Apr 2024 06:35:21 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>
>
> On 02/04/24 00:36, Tatsuyuki Ishi wrote:
>>> On Apr 2, 2024, at 4:29, Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> wrote:
>>>
>>> Maybe a better option, now that glibc has internally riscv_hwprobe
>>> support and that RVV is only support for 6.5, to use instead of adding
>>> another ABI variant.

We might end up with a new ABI variant for the hard-vector calling 
convention (ie, passing vector registers in arguments)
at some point, but that's a little way off.  We don't even support it as 
a base ABI in GCC yet, and it's too late to do something like that for 
14.

Either way we'd need to make tlsdesc work on systems with the V 
extension and the soft-vector ABI, which means dynamically probing for 
it.

>> Does this mean that it can be assumed that RVV code will only be executed on an environment that also supports riscv_hwprobe? In that case, I agree that we should switch the RVV path to use feature detection.
>
> Unless RVV support is backported without backporting riscv_hwprobe as well,
> although I expected that RISC-V maintainer would avoid it because the
> whole riscv_hwprobe is to enable runtime selection for RVV and alike
> features.

Ya, that seems like a way to just make headaches.  V is particularly 
tricky because it has use-visible state and thus everything needs to 
know about it to work.

There's also a prctl() for enabling/disabling the V extension at 
runtime that we'll need to check, but that's in the same spot.

>> I suppose the softfp path will remain the same since not all softfp environment will support hwprobe per the reasoning.

hwprobe doesn't change anything about the soft float ABI, that needs to 
be linked with compatible objects in order to function correctly.  
That's the same spot we'd end up in for the hard-vector ABI, if we added 
one.

> I take that softfp is a de-facto ABI for RISC-V, at least this what we
> have on build-many-glibcs.py:
>
>   riscv64-linux-gnu-rv64imac-lp64
>   riscv64-linux-gnu-rv64imafdc-lp64
>   riscv64-linux-gnu-rv64imafdc-lp64d

Ya, we have soft-float and double-float ABIs in glibc.  There's also a 
single-float ABI in GCC, but we decided not to add that to glibc to 
avoid the complexity.

> I am not sure whether RISC-V maintainer would like to move to profilers
> or would keep this extensions composability manner.

What do you mean by "move to profilers"?  "move to profiles"?

We have the profiles in RISC-V, but they don't really fix anything here: 
there's a ton of them, they still have various optional extensions, and 
the HW support is all over the place (we don't even have compatibility 
guarantees with individual extensions, for example).

So I think for just sticking to the current base ABIs is the way to go.  
Maybe at some point HW vendors will start shipping systems that are 
compatible with some common extension set and we can promote that to a 
new base ABI, but we're a way off from that happening.

>
>>
>> Tatsuyuki.
  
Adhemerval Zanella Netto April 2, 2024, 3:32 p.m. UTC | #6
On 02/04/24 12:25, Palmer Dabbelt wrote:

> 
>> I am not sure whether RISC-V maintainer would like to move to profilers
>> or would keep this extensions composability manner.
> 
> What do you mean by "move to profilers"?  "move to profiles"?
> 
> We have the profiles in RISC-V, but they don't really fix anything here: there's a ton of them, they still have various optional extensions, and the HW support is all over the place (we don't even have compatibility guarantees with individual extensions, for example).
> 
> So I think for just sticking to the current base ABIs is the way to go.  Maybe at some point HW vendors will start shipping systems that are compatible with some common extension set and we can promote that to a new base ABI, but we're a way off from that happening.

Indeed I meant 'profiles' here and my understanding was that something like
RVA22U64 as base would simplify things a bit (at least with the possible
testing/checking the multiple build permutations that using optional
extensions would incur).
  
Palmer Dabbelt April 2, 2024, 4:37 p.m. UTC | #7
On Tue, 02 Apr 2024 08:32:59 PDT (-0700), adhemerval.zanella@linaro.org wrote:
>
>
> On 02/04/24 12:25, Palmer Dabbelt wrote:
>
>>
>>> I am not sure whether RISC-V maintainer would like to move to profilers
>>> or would keep this extensions composability manner.
>>
>> What do you mean by "move to profilers"?  "move to profiles"?
>>
>> We have the profiles in RISC-V, but they don't really fix anything here: there's a ton of them, they still have various optional extensions, and the HW support is all over the place (we don't even have compatibility guarantees with individual extensions, for example).
>>
>> So I think for just sticking to the current base ABIs is the way to go.  Maybe at some point HW vendors will start shipping systems that are compatible with some common extension set and we can promote that to a new base ABI, but we're a way off from that happening.
>
> Indeed I meant 'profiles' here and my understanding was that something like
> RVA22U64 as base would simplify things a bit (at least with the possible
> testing/checking the multiple build permutations that using optional
> extensions would incur).

If we could get all the HW vendors to agree to something it wouldn't 
hurt, but trying to move to one of the newer profiles would mean 
dropping support for existing hardware (and thus breaking users who are 
stuck with it, a lot of this newer hardware seems pretty buggy right 
now).  Vendors seem to be implementing stuff that's close to the 
profiles, but it's still a bit of a mess -- plus we end up with the 
whole vendor self-certification issue, which means it's really hard to 
tell what HW actually does from the marketing material.

I'm hoping that something like Android will help here, as there we'll 
have people who are actually shipping systems defining the compatibility 
requirements.  With any luck that will start getting these issues at 
least understood over at RVI, but it's still going to be a long process 
before we get stuff sane.
  

Patch

diff --git a/sysdeps/riscv/Makefile b/sysdeps/riscv/Makefile
index c08753ae8a..fc16081cde 100644
--- a/sysdeps/riscv/Makefile
+++ b/sysdeps/riscv/Makefile
@@ -4,6 +4,16 @@  endif
 
 ifeq ($(subdir),elf)
 gen-as-const-headers += dl-link.sym
+sysdep-dl-routines += \
+  dl-tlsdesc \
+  tlsdesc \
+  # routines
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += \
+  tlsdesc.sym \
+  # gen-as-const-headers
 endif
 
 # RISC-V's assembler also needs to know about PIC as it changes the definition
diff --git a/sysdeps/riscv/dl-lookupcfg.h b/sysdeps/riscv/dl-lookupcfg.h
new file mode 100644
index 0000000000..d75a48f50c
--- /dev/null
+++ b/sysdeps/riscv/dl-lookupcfg.h
@@ -0,0 +1,27 @@ 
+/* Configuration of lookup functions.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
index b2f28697f7..3d5e63040d 100644
--- a/sysdeps/riscv/dl-machine.h
+++ b/sysdeps/riscv/dl-machine.h
@@ -25,6 +25,7 @@ 
 #include <elf/elf.h>
 #include <sys/asm.h>
 #include <dl-tls.h>
+#include <dl-tlsdesc.h>
 #include <dl-irel.h>
 #include <dl-static-tls.h>
 #include <dl-machine-rel.h>
@@ -50,7 +51,8 @@ 
      || (__WORDSIZE == 32 && (type) == R_RISCV_TLS_TPREL32)	\
      || (__WORDSIZE == 64 && (type) == R_RISCV_TLS_DTPREL64)	\
      || (__WORDSIZE == 64 && (type) == R_RISCV_TLS_DTPMOD64)	\
-     || (__WORDSIZE == 64 && (type) == R_RISCV_TLS_TPREL64)))	\
+     || (__WORDSIZE == 64 && (type) == R_RISCV_TLS_TPREL64)	\
+     || ((type) == R_RISCV_TLSDESC)))				\
    | (ELF_RTYPE_CLASS_COPY * ((type) == R_RISCV_COPY)))
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
@@ -219,6 +221,34 @@  elf_machine_rela (struct link_map *map, struct r_scope_elem *scope[],
 	}
       break;
 
+    case R_RISCV_TLSDESC:
+      struct tlsdesc *td = (struct tlsdesc *) addr_field;
+      if (sym == NULL)
+	{
+	  td->entry = _dl_tlsdesc_undefweak;
+	  td->arg = (void *) reloc->r_addend;
+	}
+      else
+	{
+# ifndef SHARED
+	  CHECK_STATIC_TLS (map, sym_map);
+# else
+	  if (!TRY_STATIC_TLS (map, sym_map))
+	    {
+	      td->entry = _dl_tlsdesc_dynamic;
+	      td->arg = _dl_make_tlsdesc_dynamic (
+		  sym_map, sym->st_value + reloc->r_addend);
+	    }
+	  else
+# endif
+	    {
+	      td->entry = _dl_tlsdesc_return;
+	      td->arg
+		  = (void *) (TLS_TPREL_VALUE (sym_map, sym) + reloc->r_addend);
+	    }
+	}
+      break;
+
     case R_RISCV_COPY:
       {
 	if (__glibc_unlikely (sym == NULL))
@@ -289,6 +319,24 @@  elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[],
       else
 	*reloc_addr = map->l_mach.plt;
     }
+  else if (__glibc_likely (r_type == R_RISCV_TLSDESC))
+    {
+      const Elf_Symndx symndx = ELFW (R_SYM) (reloc->r_info);
+      const ElfW (Sym) *symtab = (const void *)D_PTR (map, l_info[DT_SYMTAB]);
+      const ElfW (Sym) *sym = &symtab[symndx];
+      const struct r_found_version *version = NULL;
+
+      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW (Half) *vernum =
+	      (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+	  version = &map->l_versions[vernum[symndx] & 0x7fff];
+	}
+
+      /* Always initialize TLS descriptors completely, because lazy
+	 initialization requires synchronization at every TLS access.  */
+      elf_machine_rela (map, scope, reloc, sym, version, reloc_addr, skip_ifunc);
+    }
   else if (__glibc_unlikely (r_type == R_RISCV_IRELATIVE))
     {
       ElfW(Addr) value = map->l_addr + reloc->r_addend;
diff --git a/sysdeps/riscv/dl-tlsdesc.S b/sysdeps/riscv/dl-tlsdesc.S
new file mode 100644
index 0000000000..69acdb6428
--- /dev/null
+++ b/sysdeps/riscv/dl-tlsdesc.S
@@ -0,0 +1,269 @@ 
+/* Thread-local storage handling in the ELF dynamic linker.
+   RISC-V version.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <tls.h>
+#include <tlsdesc.h>
+
+/* The fast path does not call function and does not need to align sp, but
+   to simplify handling when going into the slow path, keep sp aligned all
+   the time.
+ */
+#define FRAME_SIZE_FAST (-((-3 * SZREG) & ALMASK))
+
+/* The slow path save slot layout, from lower address to higher address, is:
+   1. 32 vector registers
+   2. 12 GP registers
+   3. 20 FP registers
+   4. 3 vector CSR registers
+
+   1. has machine-dependent size, and hence is not included in FRAME_SIZE_SLOW.
+   Additionally, the vector register save area needs to be naturally aligned:
+   this is satisfied as a side effect of 16-byte stack alignment.
+   The size of vector save area, OTOH, also needs to satisfy stack alignment, as
+   implementations can have vector registers smaller than 16 bytes.
+   For now, the size is guaranteed to be a multiple of 16 as we save all 32 vector registers.
+ */
+#if defined(__riscv_float_abi_soft)
+# define FRAME_SIZE_SLOW (-((-12 * SZREG) & ALMASK))
+#elif defined(__riscv_vector)
+# define FRAME_SIZE_SLOW (-((-15 * SZREG - 20 * SZFREG) & ALMASK))
+#else
+# define FRAME_SIZE_SLOW (-((-12 * SZREG - 20 * SZFREG) & ALMASK))
+#endif
+
+	.text
+
+	/* Compute the thread pointer offset for symbols in the static
+	   TLS block. The offset is the same for all threads.
+	   Prototype:
+	   _dl_tlsdesc_return (tlsdesc *) ;
+	 */
+ENTRY (_dl_tlsdesc_return)
+	REG_L a0, TLSDESC_ARG(a0)
+	jr t0
+END (_dl_tlsdesc_return)
+
+	/* Handler for undefined weak TLS symbols.
+	   Prototype:
+	   _dl_tlsdesc_undefweak (tlsdesc *);
+
+	   The second word of the descriptor contains the addend.
+	   Return the addend minus the thread pointer. This ensures
+	   that when the caller adds on the thread pointer it gets back
+	   the addend.  */
+
+ENTRY (_dl_tlsdesc_undefweak)
+	REG_L a0, TLSDESC_ARG(a0)
+	sub a0, a0, tp
+	jr t0
+END (_dl_tlsdesc_undefweak)
+
+#ifdef SHARED
+	/* Handler for dynamic TLS symbols.
+	   Prototype:
+	   _dl_tlsdesc_dynamic (tlsdesc *) ;
+
+	   The second word of the descriptor points to a
+	   tlsdesc_dynamic_arg structure.
+
+	   Returns the offset between the thread pointer and the
+	   object referenced by the argument.
+
+	   unsigned long
+	   _dl_tlsdesc_dynamic (struct tlsdesc *tdp)
+	   {
+	     struct tlsdesc_dynamic_arg *td = tdp->arg;
+	     dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + TCBHEAD_DTV);
+	     if (__builtin_expect (td->gen_count <= dtv[0].counter
+		&& (dtv[td->tlsinfo.ti_module].pointer.val
+		    != TLS_DTV_UNALLOCATED),
+		1))
+	       return dtv[td->tlsinfo.ti_module].pointer.val
+		+ td->tlsinfo.ti_offset
+		- __thread_pointer;
+
+	     return ___tls_get_addr (&td->tlsinfo) - __thread_pointer;
+	   }
+	 */
+
+ENTRY (_dl_tlsdesc_dynamic)
+	/* Save just enough registers to support fast path, if we fall
+	   into slow path we will save additional registers.  */
+	add	sp, sp, -FRAME_SIZE_FAST
+	REG_S	t0, 0*SZREG(sp)
+	REG_S	t1, 1*SZREG(sp)
+	REG_S	t2, 2*SZREG(sp)
+
+	/* t0 = dtv */
+	REG_L	t0, TCBHEAD_DTV(tp)
+	/* a0 = tdp->arg */
+	REG_L	a0, TLSDESC_ARG(a0)
+	/* t1 = td->gen_count */
+	REG_L	t1, TLSDESC_GEN_COUNT(a0)
+	/* t2 = dtv[0].counter */
+	REG_L	t2, DTV_COUNTER(t0)
+	bltu	t2, t1, .Lslow
+	/* t1 = td->tlsinfo.ti_module */
+	REG_L	t1, TLSDESC_MODID(a0)
+	slli	t1, t1, PTRLOG + 1 /* sizeof(dtv_t) == sizeof(void*) * 2 */
+	add	t1, t1, t0
+	/* t1 = dtv[td->tlsinfo.ti_module].pointer.val  */
+	REG_L	t1, 0(t1)
+	li	t2, TLS_DTV_UNALLOCATED
+	beq	t1, t2, .Lslow
+	/* t2 = td->tlsinfo.ti_offset */
+	REG_L	t2, TLSDESC_MODOFF(a0)
+	add	a0, t1, t2
+.Lret:
+	sub	a0, a0, tp
+	REG_L	t0, 0*SZREG(sp)
+	REG_L	t1, 1*SZREG(sp)
+	REG_L	t2, 2*SZREG(sp)
+	add	sp, sp, FRAME_SIZE_FAST
+	jr	t0
+.Lslow:
+	/* This is the slow path. We need to call __tls_get_addr() which
+	   means we need to save and restore all the register that the
+	   callee will trash.  */
+
+	/* Save the remaining registers that we must treat as caller save.  */
+	addi	sp, sp, -FRAME_SIZE_SLOW
+	REG_S	ra, 0*SZREG(sp)
+	REG_S	a1, 1*SZREG(sp)
+	REG_S	a2, 2*SZREG(sp)
+	REG_S	a3, 3*SZREG(sp)
+	REG_S	a4, 4*SZREG(sp)
+	REG_S	a5, 5*SZREG(sp)
+	REG_S	a6, 6*SZREG(sp)
+	REG_S	a7, 7*SZREG(sp)
+	REG_S	t3, 8*SZREG(sp)
+	REG_S	t4, 9*SZREG(sp)
+	REG_S	t5, 10*SZREG(sp)
+	REG_S	t6, 11*SZREG(sp)
+
+#ifndef __riscv_float_abi_soft
+	FREG_S	ft0, (12*SZREG + 0*SZFREG)(sp)
+	FREG_S	ft1, (12*SZREG + 1*SZFREG)(sp)
+	FREG_S	ft2, (12*SZREG + 2*SZFREG)(sp)
+	FREG_S	ft3, (12*SZREG + 3*SZFREG)(sp)
+	FREG_S	ft4, (12*SZREG + 4*SZFREG)(sp)
+	FREG_S	ft5, (12*SZREG + 5*SZFREG)(sp)
+	FREG_S	ft6, (12*SZREG + 6*SZFREG)(sp)
+	FREG_S	ft7, (12*SZREG + 7*SZFREG)(sp)
+	FREG_S	fa0, (12*SZREG + 8*SZFREG)(sp)
+	FREG_S	fa1, (12*SZREG + 9*SZFREG)(sp)
+	FREG_S	fa2, (12*SZREG + 10*SZFREG)(sp)
+	FREG_S	fa3, (12*SZREG + 11*SZFREG)(sp)
+	FREG_S	fa4, (12*SZREG + 12*SZFREG)(sp)
+	FREG_S	fa5, (12*SZREG + 13*SZFREG)(sp)
+	FREG_S	fa6, (12*SZREG + 14*SZFREG)(sp)
+	FREG_S	fa7, (12*SZREG + 15*SZFREG)(sp)
+	FREG_S	ft8, (12*SZREG + 16*SZFREG)(sp)
+	FREG_S	ft9, (12*SZREG + 17*SZFREG)(sp)
+	FREG_S	ft10, (12*SZREG + 18*SZFREG)(sp)
+	FREG_S	ft11, (12*SZREG + 19*SZFREG)(sp)
+#endif
+
+#ifdef __riscv_vector
+	csrr	t0, vl
+	csrr	t1, vtype
+	csrr	t2, vstart
+	REG_S	t0, (12*SZREG + 20*SZFREG)(sp)
+	REG_S	t1, (13*SZREG + 20*SZFREG)(sp)
+	REG_S	t2, (14*SZREG + 20*SZFREG)(sp)
+
+	csrr	t0, vlenb
+	slli	t1, t0, 5
+	slli	t0, t0, 3
+	sub	sp, sp, t1
+	vs8r.v	v0, (sp)
+	add	sp, sp, t0
+	vs8r.v	v8, (sp)
+	add	sp, sp, t0
+	vs8r.v	v16, (sp)
+	add	sp, sp, t0
+	vs8r.v	v24, (sp)
+	sub	t0, t1, t0
+	sub	sp, sp, t0
+#endif
+
+	call	__tls_get_addr
+	addi	a0, a0, -TLS_DTV_OFFSET
+
+#ifdef __riscv_vector
+	csrr	t0, vlenb
+	slli	t0, t0, 3
+	vl8r.v	v0, (sp)
+	add	sp, sp, t0
+	vl8r.v	v8, (sp)
+	add	sp, sp, t0
+	vl8r.v	v16, (sp)
+	add	sp, sp, t0
+	vl8r.v	v24, (sp)
+	add	sp, sp, t0
+
+	REG_L	t0, (12*SZREG + 20*SZFREG)(sp)
+	REG_L	t1, (13*SZREG + 20*SZFREG)(sp)
+	REG_L	t2, (14*SZREG + 20*SZFREG)(sp)
+	vsetvl	zero, t0, t1
+	csrw	vstart, t2
+#endif
+
+	REG_L	ra, 0*SZREG(sp)
+	REG_L	a1, 1*SZREG(sp)
+	REG_L	a2, 2*SZREG(sp)
+	REG_L	a3, 3*SZREG(sp)
+	REG_L	a4, 4*SZREG(sp)
+	REG_L	a5, 5*SZREG(sp)
+	REG_L	a6, 6*SZREG(sp)
+	REG_L	a7, 7*SZREG(sp)
+	REG_L	t3, 8*SZREG(sp)
+	REG_L	t4, 9*SZREG(sp)
+	REG_L	t5, 10*SZREG(sp)
+	REG_L	t6, 11*SZREG(sp)
+
+#ifndef __riscv_float_abi_soft
+	FREG_L	ft0, (12*SZREG + 0*SZFREG)(sp)
+	FREG_L	ft1, (12*SZREG + 1*SZFREG)(sp)
+	FREG_L	ft2, (12*SZREG + 2*SZFREG)(sp)
+	FREG_L	ft3, (12*SZREG + 3*SZFREG)(sp)
+	FREG_L	ft4, (12*SZREG + 4*SZFREG)(sp)
+	FREG_L	ft5, (12*SZREG + 5*SZFREG)(sp)
+	FREG_L	ft6, (12*SZREG + 6*SZFREG)(sp)
+	FREG_L	ft7, (12*SZREG + 7*SZFREG)(sp)
+	FREG_L	fa0, (12*SZREG + 8*SZFREG)(sp)
+	FREG_L	fa1, (12*SZREG + 9*SZFREG)(sp)
+	FREG_L	fa2, (12*SZREG + 10*SZFREG)(sp)
+	FREG_L	fa3, (12*SZREG + 11*SZFREG)(sp)
+	FREG_L	fa4, (12*SZREG + 12*SZFREG)(sp)
+	FREG_L	fa5, (12*SZREG + 13*SZFREG)(sp)
+	FREG_L	fa6, (12*SZREG + 14*SZFREG)(sp)
+	FREG_L	fa7, (12*SZREG + 15*SZFREG)(sp)
+	FREG_L	ft8, (12*SZREG + 16*SZFREG)(sp)
+	FREG_L	ft9, (12*SZREG + 17*SZFREG)(sp)
+	FREG_L	ft10, (12*SZREG + 18*SZFREG)(sp)
+	FREG_L	ft11, (12*SZREG + 19*SZFREG)(sp)
+#endif
+
+	addi	sp, sp, FRAME_SIZE_SLOW
+	j	.Lret
+END (_dl_tlsdesc_dynamic)
+#endif
diff --git a/sysdeps/riscv/dl-tlsdesc.h b/sysdeps/riscv/dl-tlsdesc.h
new file mode 100644
index 0000000000..0c9b83f43d
--- /dev/null
+++ b/sysdeps/riscv/dl-tlsdesc.h
@@ -0,0 +1,48 @@ 
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+   RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _DL_TLSDESC_H
+# define _DL_TLSDESC_H 1
+
+#include <dl-tls.h>
+
+/* Type used to represent a TLS descriptor in the GOT.  */
+struct tlsdesc
+{
+  unsigned long (*entry) (struct tlsdesc *);
+  void *arg;
+};
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+   needs dynamic TLS offsets.  */
+struct tlsdesc_dynamic_arg
+{
+  tls_index tlsinfo;
+  size_t gen_count;
+};
+
+extern unsigned long _dl_tlsdesc_return (struct tlsdesc *) attribute_hidden;
+extern unsigned long _dl_tlsdesc_undefweak (struct tlsdesc *) attribute_hidden;
+
+# ifdef SHARED
+extern void *_dl_make_tlsdesc_dynamic (struct link_map *, size_t);
+extern unsigned long _dl_tlsdesc_dynamic (struct tlsdesc *) attribute_hidden;
+# endif
+
+#endif /* _DL_TLSDESC_H */
diff --git a/sysdeps/riscv/linkmap.h b/sysdeps/riscv/linkmap.h
index ac170bb342..2fa3f6d43f 100644
--- a/sysdeps/riscv/linkmap.h
+++ b/sysdeps/riscv/linkmap.h
@@ -1,4 +1,5 @@ 
 struct link_map_machine
   {
     ElfW(Addr) plt; /* Address of .plt.  */
+    void *tlsdesc_table; /* Address of TLS descriptor hash table.  */
   };
diff --git a/sysdeps/riscv/preconfigure b/sysdeps/riscv/preconfigure
index a5de5ccb7d..493d7d98f5 100644
--- a/sysdeps/riscv/preconfigure
+++ b/sysdeps/riscv/preconfigure
@@ -57,6 +57,7 @@  riscv*)
 
     base_machine=riscv
     machine=riscv/rv$xlen/$float_machine
+    mtls_descriptor=desc
 
     printf "%s\n" "#define RISCV_ABI_XLEN $xlen" >>confdefs.h
 
diff --git a/sysdeps/riscv/tlsdesc.c b/sysdeps/riscv/tlsdesc.c
new file mode 100644
index 0000000000..d013bc7135
--- /dev/null
+++ b/sysdeps/riscv/tlsdesc.c
@@ -0,0 +1,38 @@ 
+/* Manage TLS descriptors.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <ldsodefs.h>
+#include <tls.h>
+#include <dl-tls.h>
+#include <dl-tlsdesc.h>
+#include <dl-unmap-segments.h>
+#include <tlsdeschtab.h>
+
+/* Unmap the dynamic object, but also release its TLS descriptor table
+   if there is one.  */
+
+void
+_dl_unmap (struct link_map *map)
+{
+  _dl_unmap_segments (map);
+
+#ifdef SHARED
+  if (map->l_mach.tlsdesc_table)
+    htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
diff --git a/sysdeps/riscv/tlsdesc.sym b/sysdeps/riscv/tlsdesc.sym
new file mode 100644
index 0000000000..652e72ea58
--- /dev/null
+++ b/sysdeps/riscv/tlsdesc.sym
@@ -0,0 +1,19 @@ 
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tls.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+
+TLSDESC_ARG		offsetof(struct tlsdesc, arg)
+TLSDESC_GEN_COUNT	offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID		offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF		offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
+TCBHEAD_DTV		offsetof(tcbhead_t, dtv) - sizeof(tcbhead_t) - TLS_TCB_OFFSET
+DTV_COUNTER		offsetof(dtv_t, counter)
+TLS_DTV_UNALLOCATED	TLS_DTV_UNALLOCATED
+TLS_DTV_OFFSET		TLS_DTV_OFFSET
diff --git a/sysdeps/riscv/tst-gnu2-tls2.c b/sysdeps/riscv/tst-gnu2-tls2.c
new file mode 100644
index 0000000000..d0b0334eab
--- /dev/null
+++ b/sysdeps/riscv/tst-gnu2-tls2.c
@@ -0,0 +1,33 @@ 
+/* Test TLSDESC relocation.  RISC-V version.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifdef __riscv_vector
+
+/* Clear vector registers. Also clobbers vl and vtype. */
+#define PREPARE_MALLOC()					\
+{								\
+  asm volatile ("vsetvli    zero, zero, e8, m8, ta, ma");	\
+  asm volatile ("vmv.v.i    v0, 0" : : : "v0" );		\
+  asm volatile ("vmv.v.i    v8, 0" : : : "v8" );		\
+  asm volatile ("vmv.v.i    v16, 0" : : : "v16" );		\
+  asm volatile ("vmv.v.i    v24, 0" : : : "v24" );		\
+}
+
+#endif /* __riscv_vector */
+
+#include_next <tst-gnu2-tls2.c>
diff --git a/sysdeps/unix/sysv/linux/riscv/localplt.data b/sysdeps/unix/sysv/linux/riscv/localplt.data
index ea887042e0..01710df22d 100644
--- a/sysdeps/unix/sysv/linux/riscv/localplt.data
+++ b/sysdeps/unix/sysv/linux/riscv/localplt.data
@@ -6,3 +6,5 @@  libc.so: free
 libc.so: malloc
 libc.so: memset ?
 libc.so: realloc
+# The dynamic loader needs __tls_get_addr for TLS.
+ld.so: __tls_get_addr