elf: Add AES-128 implementation for arc4random

Message ID 20180302125302.5CF0F4045458E@oldenburg.str.redhat.com
State Superseded

Commit Message

Florian Weimer March 2, 2018, 12:53 p.m. UTC
  This commit imports the AES-128 implementation from libgcrypt.

This code has to reside in ld.so because it will be used to
initialize the stack protector cookie and the pointer guard
from the AT_RANDOM variable.

AES-128 was chosen as the cryptographic primitive because hardware
support for AES-128 is much more widespread than for SHA-1 or SHA-256.
This means that we can add hardware acceleration for arc4random for
a larger number of systems, as a subsequent optimization.
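
For readers unfamiliar with AT_RANDOM: on Linux, the kernel passes a pointer
to 16 random bytes in the auxiliary vector, which matches the 16-byte key
size of AES-128.  The following is only a user-space sketch using
getauxval; the helper name read_at_random is hypothetical, and the dynamic
linker itself reads the auxiliary vector directly rather than through this
call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval, AT_RANDOM (Linux).  */

/* Hypothetical helper: copy the 16 kernel-provided random bytes into
   SEED.  Returns 0 on success, -1 if AT_RANDOM is not available.  */
static int
read_at_random (unsigned char seed[16])
{
  /* getauxval returns 0 if the entry is missing.  */
  const unsigned char *p
    = (const unsigned char *) (uintptr_t) getauxval (AT_RANDOM);
  if (p == NULL)
    return -1;
  memcpy (seed, p, 16);
  return 0;
}
```

The resulting 16 bytes are the kind of key material that a function such
as _dl_arc4random_schedule from this patch would take as its KEY argument.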

2018-03-02  Florian Weimer  <fweimer@redhat.com>

	* LICENSES: Add new notices from libgcrypt.
	* elf/Makefile (dl-routines): Add dl-arc4random.
	(tests-internal): Add tst-dl-arc4random, tst-dl-arc4random-static.
	(tests-static): Add tst-dl-arc4random-static.
	* Versions (GLIBC_PRIVATE): Export _dl_arc4random_schedule,
	_dl_arc4random_block.
	* elf/dl-arc4random.c: New file.
	* elf/tst-dl-arc4random.c: Likewise.
	* elf/tst-dl-arc4random-static.c: Likewise.
	* include/arc4random.h: Likewise.
  

Comments

Joseph Myers March 2, 2018, 5:15 p.m. UTC | #1
On Fri, 2 Mar 2018, Florian Weimer wrote:

> diff --git a/LICENSES b/LICENSES
> index 80f7f14879..dc73a5a198 100644
> --- a/LICENSES
> +++ b/LICENSES
> @@ -467,3 +467,59 @@ Copyright 2001 by Stephen L. Moshier <moshier@na-net.ornl.gov>
>   You should have received a copy of the GNU Lesser General Public
>   License along with this library; if not, see
>   <http://www.gnu.org/licenses/>.  */
> +
> +elf/dl-arc4random.c imports code from libcrypt, with the following
> +notices:

"from libgcrypt" (not libcrypt)?
  
Adhemerval Zanella March 2, 2018, 7:27 p.m. UTC | #2
On 02/03/2018 09:53, Florian Weimer wrote:
> This commit imports the AES-128 implementation from libgcrypt.
> 
> This code has to reside in ld.so because it will be used to
> initialize the stack protector cookie and the pointer guard
> from the AT_RANDOM variable.
> 
> AES-128 was chosen as the cryptographic primitive because hardware
> support for AES-128 is much more widespread than for SHA-1 or SHA-256.
> This means that we can add hardware acceleration for arc4random for
> a larger number of systems, as a subsequent optimization.

I noticed that other systems (*BSD, the Linux kernel, etc.) use ChaCha20 instead
of AES-128 for both arc4random and /dev/{u}random, but I don't have much
information on why exactly ChaCha20 was picked.  Checking some
discussion of why ChaCha20 is preferable [1], it seems it is usually faster on
hardware without specialized instructions and less susceptible to cache
timing attacks.  However, cryptanalysis is not really my forte, so I am just
curious why we should do something different from others for arc4random.

[1] https://crypto.stackexchange.com/questions/34455/whats-the-appeal-of-using-chacha20-instead-of-aes

> 
> 2018-03-02  Florian Weimer  <fweimer@redhat.com>
> 
> 	* LICENSES: Add new notices from libgcrypt.
> 	* elf/Makefile (dl-routines): Add dl-arc4random.
> 	(tests-internal): Add tst-dl-arc4random, tst-dl-arc4random-static.
> 	(tests-static): Add tst-dl-arc4random-static.
> 	* Versions (GLIBC_PRIVATE): Export _dl_arc4random_schedule,
> 	_dl_arc4random_block.
> 	* elf/dl-arc4random.c: New file.
> 	* elf/tst-dl-arc4random.c: Likewise.
> 	* elf/tst-dl-arc4random-static.c: Likewise.
> 	* include/arc4random.h: Likewise.
> 
> diff --git a/LICENSES b/LICENSES
> index 80f7f14879..dc73a5a198 100644
> --- a/LICENSES
> +++ b/LICENSES
> @@ -467,3 +467,59 @@ Copyright 2001 by Stephen L. Moshier <moshier@na-net.ornl.gov>
>   You should have received a copy of the GNU Lesser General Public
>   License along with this library; if not, see
>   <http://www.gnu.org/licenses/>.  */
> +
> +elf/dl-arc4random.c imports code from libcrypt, with the following
> +notices:
> +
> +Rijndael (AES) for GnuPG
> +Copyright (C) 2000, 2001, 2002, 2003, 2007,
> +              2008, 2011, 2012 Free Software Foundation, Inc.
> +
> +This file is part of Libgcrypt.
> +
> +Libgcrypt is free software; you can redistribute it and/or modify
> +it under the terms of the GNU Lesser General Public License as
> +published by the Free Software Foundation; either version 2.1 of
> +the License, or (at your option) any later version.
> +
> +Libgcrypt is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU Lesser General Public License for more details.
> +
> +You should have received a copy of the GNU Lesser General Public
> +License along with this program; if not, see <http://www.gnu.org/licenses/>.
> +******************************************************************
> +The code here is based on the optimized implementation taken from
> +http://www.esat.kuleuven.ac.be/~rijmen/rijndael/ on Oct 2, 2000,
> +which carries this notice:
> +------------------------------------------
> +rijndael-alg-fst.c   v2.3   April '2000
> +
> +Optimised ANSI C code
> +
> +authors: v1.0: Antoon Bosselaers
> +         v2.0: Vincent Rijmen
> +         v2.3: Paulo Barreto
> +
> +This code is placed in the public domain.
> +------------------------------------------
> +
> +rijndael-tables.h - Rijndael (AES) for GnuPG,
> +Copyright (C) 2000, 2001, 2002, 2003, 2007,
> +              2008 Free Software Foundation, Inc.
> +
> +This file is part of Libgcrypt.
> +
> +Libgcrypt is free software; you can redistribute it and/or modify
> +it under the terms of the GNU Lesser General Public License as
> +published by the Free Software Foundation; either version 2.1 of
> +the License, or (at your option) any later version.
> +
> +Libgcrypt is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU Lesser General Public License for more details.
> +
> +You should have received a copy of the GNU Lesser General Public
> +License along with this program; if not, see <http://www.gnu.org/licenses/>.
> diff --git a/elf/Makefile b/elf/Makefile
> index 9bdb9220c7..043c0242a4 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -33,7 +33,7 @@ dl-routines	= $(addprefix dl-,load lookup object reloc deps hwcaps \
>  				  runtime init fini debug misc \
>  				  version profile tls origin scope \
>  				  execstack open close trampoline \
> -				  exception sort-maps)
> +				  exception sort-maps arc4random)
>  ifeq (yes,$(use-ldconfig))
>  dl-routines += dl-cache
>  endif
> @@ -394,6 +394,10 @@ ifeq (yesno,$(nss-crypt)$(static-nss-crypt))
>  CFLAGS-tst-linkall-static.c += -UUSE_CRYPT -DUSE_CRYPT=0
>  endif
>  
> +# Test for internal-only PRNG building blocks.
> +tests-internal += tst-dl-arc4random tst-dl-arc4random-static
> +tests-static += tst-dl-arc4random-static
> +
>  include ../Rules
>  
>  ifeq (yes,$(build-shared))
> diff --git a/elf/Versions b/elf/Versions
> index 3b09901f6c..2c3e4bd061 100644
> --- a/elf/Versions
> +++ b/elf/Versions
> @@ -78,5 +78,8 @@ ld {
>  
>      # Set value of a tunable.
>      __tunable_get_val;
> +
> +    # PRNG building blocks.
> +    _dl_arc4random_schedule; _dl_arc4random_block;
>    }
>  }
> diff --git a/elf/dl-arc4random.c b/elf/dl-arc4random.c
> new file mode 100644
> index 0000000000..26ec061764
> --- /dev/null
> +++ b/elf/dl-arc4random.c
> @@ -0,0 +1,401 @@
> +/* Low-level AES-128 routines for the arc4random PRNG.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +
> +#include <arc4random.h>
> +
> +/* Copied from libgcrypt.  */
> +static void do_setkey (struct arc4random_global_state *,
> +                       const unsigned char *);
> +static void
> +do_encrypt_fn (const struct arc4random_global_state *state,
> +               struct arc4random_block *b,
> +               const struct arc4random_block *a);
> +static const uint32_t encT[256];
> +static const uint32_t rcon[30];
> +
> +void
> +_dl_arc4random_schedule (struct arc4random_global_state *state,
> +                         const unsigned char *key)
> +{
> +  do_setkey (state, key);
> +}
> +rtld_hidden_def (_dl_arc4random_schedule)
> +
> +void
> +_dl_arc4random_block (const struct arc4random_global_state *state,
> +                      struct arc4random_personalization personalization,
> +                      struct arc4random_block *output)
> +{
> +  union
> +  {
> +    struct arc4random_personalization personalization;
> +    struct arc4random_block block;
> +  } u;
> +  u.personalization = personalization;
> +  do_encrypt_fn (state, output, &u.block);
> +}
> +rtld_hidden_def (_dl_arc4random_block)
> +
> +static inline uint32_t
> +le_bswap32 (uint32_t value)
> +{
> +#if __BYTE_ORDER == __BIG_ENDIAN
> +  return __builtin_bswap32 (value);
> +#elif  __BYTE_ORDER == __LITTLE_ENDIAN
> +  return value;
> +#else
> +# error invalid __BYTE_ORDER
> +#endif
> +}
> +
> +static inline uint32_t
> +rol (uint32_t value, unsigned int shift)
> +{
> +  return (value << (shift & 31)) | (value >> ((32 - shift) & 31));
> +}
> +
> +/* The do_setkey and do_encrypt_fn functions are based on rijndael.c
> +   from libgcrypt 1.8.1, which has the following copyright
> +   information.  */
> +
> +/* Rijndael (AES) for GnuPG
> + * Copyright (C) 2000, 2001, 2002, 2003, 2007,
> + *               2008, 2011, 2012 Free Software Foundation, Inc.
> + *
> + * This file is part of Libgcrypt.
> + *
> + * Libgcrypt is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as
> + * published by the Free Software Foundation; either version 2.1 of
> + * the License, or (at your option) any later version.
> + *
> + * Libgcrypt is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this program; if not, see <http://www.gnu.org/licenses/>.
> + *******************************************************************
> + * The code here is based on the optimized implementation taken from
> + * http://www.esat.kuleuven.ac.be/~rijmen/rijndael/ on Oct 2, 2000,
> + * which carries this notice:
> + *------------------------------------------
> + * rijndael-alg-fst.c   v2.3   April '2000
> + *
> + * Optimised ANSI C code
> + *
> + * authors: v1.0: Antoon Bosselaers
> + *          v2.0: Vincent Rijmen
> + *          v2.3: Paulo Barreto
> + *
> + * This code is placed in the public domain.
> + *------------------------------------------
> + *
> + * The SP800-38a document is available at:
> + *   http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf
> + *
> + */
> +
> +static void
> +do_setkey (struct arc4random_global_state *state, const unsigned char *key)
> +{
> +  enum { KC = 4, keylen = 16 };
> +  const unsigned char *sbox = ((const unsigned char *) encT) + 1;
> +  union
> +  {
> +    unsigned char tk[KC][4];
> +    uint32_t tk_u32[KC];
> +  } u;
> +  int rconpointer = 0;
> +
> +  typedef uint32_t unaligned_uint32_t __attribute__ ((aligned (1)));
> +  const unaligned_uint32_t *key32 = (const unaligned_uint32_t *) key;
> +
> +  for (int j = 0; j < KC; ++j)
> +    u.tk_u32[j] = key32[j];
> +
> +  int r = 0;
> +  int t = 0;
> +  /* Copy values into round key array.  */
> +  for (int j = 0; j < KC && r < arc4random_aes_rounds + 1; )
> +    {
> +      for (; j < KC && t < 4; j++, t++)
> +        state->key_schedule[r][t] = le_bswap32 (u.tk_u32[j]);
> +      if (t == 4)
> +        {
> +          r++;
> +          t = 0;
> +        }
> +    }
> +
> +  while (r < arc4random_aes_rounds + 1)
> +    {
> +      /* While not enough round key material has been calculated,
> +         calculate new values.  */
> +      u.tk[0][0] ^= sbox[u.tk[KC - 1][1] * 4];
> +      u.tk[0][1] ^= sbox[u.tk[KC - 1][2] * 4];
> +      u.tk[0][2] ^= sbox[u.tk[KC - 1][3] * 4];
> +      u.tk[0][3] ^= sbox[u.tk[KC - 1][0] * 4];
> +      u.tk[0][0] ^= rcon[rconpointer++];
> +
> +      for (int j = 1; j < KC; j++)
> +        u.tk_u32[j] ^= u.tk_u32[j - 1];
> +
> +      /* Copy values into round key array.  */
> +      for (int j = 0; j < KC && r < arc4random_aes_rounds + 1; )
> +        {
> +          for (; j < KC && t < 4; j++, t++)
> +            state->key_schedule[r][t] = le_bswap32 (u.tk_u32[j]);
> +          if (t == 4)
> +            {
> +              r++;
> +              t = 0;
> +            }
> +        }
> +    }
> +}
> +
> +static void
> +do_encrypt_fn (const struct arc4random_global_state *state,
> +               struct arc4random_block *b,
> +               const struct arc4random_block *a)
> +{
> +  const unsigned char *sbox = ((const unsigned char *)encT) + 1;
> +  int r;
> +  uint32_t sa[4];
> +  uint32_t sb[4];
> +
> +  sb[0] = le_bswap32 (a->data[0]);
> +  sb[1] = le_bswap32 (a->data[1]);
> +  sb[2] = le_bswap32 (a->data[2]);
> +  sb[3] = le_bswap32 (a->data[3]);
> +
> +  sa[0] = sb[0] ^ state->key_schedule[0][0];
> +  sa[1] = sb[1] ^ state->key_schedule[0][1];
> +  sa[2] = sb[2] ^ state->key_schedule[0][2];
> +  sa[3] = sb[3] ^ state->key_schedule[0][3];
> +
> +  sb[0] = rol (encT[(uint8_t) (sa[0] >> (0 * 8))], (0 * 8));
> +  sb[3] = rol (encT[(uint8_t) (sa[0] >> (1 * 8))], (1 * 8));
> +  sb[2] = rol (encT[(uint8_t) (sa[0] >> (2 * 8))], (2 * 8));
> +  sb[1] = rol (encT[(uint8_t) (sa[0] >> (3 * 8))], (3 * 8));
> +  sa[0] = state->key_schedule[1][0] ^ sb[0];
> +
> +  sb[1] ^= rol (encT[(uint8_t) (sa[1] >> (0 * 8))], (0 * 8));
> +  sa[0] ^= rol (encT[(uint8_t) (sa[1] >> (1 * 8))], (1 * 8));
> +  sb[3] ^= rol (encT[(uint8_t) (sa[1] >> (2 * 8))], (2 * 8));
> +  sb[2] ^= rol (encT[(uint8_t) (sa[1] >> (3 * 8))], (3 * 8));
> +  sa[1] = state->key_schedule[1][1] ^ sb[1];
> +
> +  sb[2] ^= rol (encT[(uint8_t) (sa[2] >> (0 * 8))], (0 * 8));
> +  sa[1] ^= rol (encT[(uint8_t) (sa[2] >> (1 * 8))], (1 * 8));
> +  sa[0] ^= rol (encT[(uint8_t) (sa[2] >> (2 * 8))], (2 * 8));
> +  sb[3] ^= rol (encT[(uint8_t) (sa[2] >> (3 * 8))], (3 * 8));
> +  sa[2] = state->key_schedule[1][2] ^ sb[2];
> +
> +  sb[3] ^= rol (encT[(uint8_t) (sa[3] >> (0 * 8))], (0 * 8));
> +  sa[2] ^= rol (encT[(uint8_t) (sa[3] >> (1 * 8))], (1 * 8));
> +  sa[1] ^= rol (encT[(uint8_t) (sa[3] >> (2 * 8))], (2 * 8));
> +  sa[0] ^= rol (encT[(uint8_t) (sa[3] >> (3 * 8))], (3 * 8));
> +  sa[3] = state->key_schedule[1][3] ^ sb[3];
> +
> +  for (r = 2; r < arc4random_aes_rounds; r++)
> +    {
> +      sb[0] = rol (encT[(uint8_t) (sa[0] >> (0 * 8))], (0 * 8));
> +      sb[3] = rol (encT[(uint8_t) (sa[0] >> (1 * 8))], (1 * 8));
> +      sb[2] = rol (encT[(uint8_t) (sa[0] >> (2 * 8))], (2 * 8));
> +      sb[1] = rol (encT[(uint8_t) (sa[0] >> (3 * 8))], (3 * 8));
> +      sa[0] = state->key_schedule[r][0] ^ sb[0];
> +
> +      sb[1] ^= rol (encT[(uint8_t) (sa[1] >> (0 * 8))], (0 * 8));
> +      sa[0] ^= rol (encT[(uint8_t) (sa[1] >> (1 * 8))], (1 * 8));
> +      sb[3] ^= rol (encT[(uint8_t) (sa[1] >> (2 * 8))], (2 * 8));
> +      sb[2] ^= rol (encT[(uint8_t) (sa[1] >> (3 * 8))], (3 * 8));
> +      sa[1] = state->key_schedule[r][1] ^ sb[1];
> +
> +      sb[2] ^= rol (encT[(uint8_t) (sa[2] >> (0 * 8))], (0 * 8));
> +      sa[1] ^= rol (encT[(uint8_t) (sa[2] >> (1 * 8))], (1 * 8));
> +      sa[0] ^= rol (encT[(uint8_t) (sa[2] >> (2 * 8))], (2 * 8));
> +      sb[3] ^= rol (encT[(uint8_t) (sa[2] >> (3 * 8))], (3 * 8));
> +      sa[2] = state->key_schedule[r][2] ^ sb[2];
> +
> +      sb[3] ^= rol (encT[(uint8_t) (sa[3] >> (0 * 8))], (0 * 8));
> +      sa[2] ^= rol (encT[(uint8_t) (sa[3] >> (1 * 8))], (1 * 8));
> +      sa[1] ^= rol (encT[(uint8_t) (sa[3] >> (2 * 8))], (2 * 8));
> +      sa[0] ^= rol (encT[(uint8_t) (sa[3] >> (3 * 8))], (3 * 8));
> +      sa[3] = state->key_schedule[r][3] ^ sb[3];
> +
> +      r++;
> +
> +      sb[0] = rol (encT[(uint8_t) (sa[0] >> (0 * 8))], (0 * 8));
> +      sb[3] = rol (encT[(uint8_t) (sa[0] >> (1 * 8))], (1 * 8));
> +      sb[2] = rol (encT[(uint8_t) (sa[0] >> (2 * 8))], (2 * 8));
> +      sb[1] = rol (encT[(uint8_t) (sa[0] >> (3 * 8))], (3 * 8));
> +      sa[0] = state->key_schedule[r][0] ^ sb[0];
> +
> +      sb[1] ^= rol (encT[(uint8_t) (sa[1] >> (0 * 8))], (0 * 8));
> +      sa[0] ^= rol (encT[(uint8_t) (sa[1] >> (1 * 8))], (1 * 8));
> +      sb[3] ^= rol (encT[(uint8_t) (sa[1] >> (2 * 8))], (2 * 8));
> +      sb[2] ^= rol (encT[(uint8_t) (sa[1] >> (3 * 8))], (3 * 8));
> +      sa[1] = state->key_schedule[r][1] ^ sb[1];
> +
> +      sb[2] ^= rol (encT[(uint8_t) (sa[2] >> (0 * 8))], (0 * 8));
> +      sa[1] ^= rol (encT[(uint8_t) (sa[2] >> (1 * 8))], (1 * 8));
> +      sa[0] ^= rol (encT[(uint8_t) (sa[2] >> (2 * 8))], (2 * 8));
> +      sb[3] ^= rol (encT[(uint8_t) (sa[2] >> (3 * 8))], (3 * 8));
> +      sa[2] = state->key_schedule[r][2] ^ sb[2];
> +
> +      sb[3] ^= rol (encT[(uint8_t) (sa[3] >> (0 * 8))], (0 * 8));
> +      sa[2] ^= rol (encT[(uint8_t) (sa[3] >> (1 * 8))], (1 * 8));
> +      sa[1] ^= rol (encT[(uint8_t) (sa[3] >> (2 * 8))], (2 * 8));
> +      sa[0] ^= rol (encT[(uint8_t) (sa[3] >> (3 * 8))], (3 * 8));
> +      sa[3] = state->key_schedule[r][3] ^ sb[3];
> +    }
> +
> +  /* Last round is special.  */
> +
> +  sb[0] = (sbox[(uint8_t) (sa[0] >> (0 * 8)) * 4]) << (0 * 8);
> +  sb[3] = (sbox[(uint8_t) (sa[0] >> (1 * 8)) * 4]) << (1 * 8);
> +  sb[2] = (sbox[(uint8_t) (sa[0] >> (2 * 8)) * 4]) << (2 * 8);
> +  sb[1] = (sbox[(uint8_t) (sa[0] >> (3 * 8)) * 4]) << (3 * 8);
> +  sa[0] = state->key_schedule[r][0] ^ sb[0];
> +
> +  sb[1] ^= (sbox[(uint8_t) (sa[1] >> (0 * 8)) * 4]) << (0 * 8);
> +  sa[0] ^= (sbox[(uint8_t) (sa[1] >> (1 * 8)) * 4]) << (1 * 8);
> +  sb[3] ^= (sbox[(uint8_t) (sa[1] >> (2 * 8)) * 4]) << (2 * 8);
> +  sb[2] ^= (sbox[(uint8_t) (sa[1] >> (3 * 8)) * 4]) << (3 * 8);
> +  sa[1] = state->key_schedule[r][1] ^ sb[1];
> +
> +  sb[2] ^= (sbox[(uint8_t) (sa[2] >> (0 * 8)) * 4]) << (0 * 8);
> +  sa[1] ^= (sbox[(uint8_t) (sa[2] >> (1 * 8)) * 4]) << (1 * 8);
> +  sa[0] ^= (sbox[(uint8_t) (sa[2] >> (2 * 8)) * 4]) << (2 * 8);
> +  sb[3] ^= (sbox[(uint8_t) (sa[2] >> (3 * 8)) * 4]) << (3 * 8);
> +  sa[2] = state->key_schedule[r][2] ^ sb[2];
> +
> +  sb[3] ^= (sbox[(uint8_t) (sa[3] >> (0 * 8)) * 4]) << (0 * 8);
> +  sa[2] ^= (sbox[(uint8_t) (sa[3] >> (1 * 8)) * 4]) << (1 * 8);
> +  sa[1] ^= (sbox[(uint8_t) (sa[3] >> (2 * 8)) * 4]) << (2 * 8);
> +  sa[0] ^= (sbox[(uint8_t) (sa[3] >> (3 * 8)) * 4]) << (3 * 8);
> +  sa[3] = state->key_schedule[r][3] ^ sb[3];
> +
> +  b->data[0] = le_bswap32 (sa[0]);
> +  b->data[1] = le_bswap32 (sa[1]);
> +  b->data[2] = le_bswap32 (sa[2]);
> +  b->data[3] = le_bswap32 (sa[3]);
> +}
> +
> +
> +/* The encT table is derived from the rijndael-tables.h file in
> +   libgcrypt 1.8.1.  */
> +
> +/* rijndael-tables.h - Rijndael (AES) for GnuPG,
> + * Copyright (C) 2000, 2001, 2002, 2003, 2007,
> + *               2008 Free Software Foundation, Inc.
> + *
> + * This file is part of Libgcrypt.
> + *
> + * Libgcrypt is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as
> + * published by the Free Software Foundation; either version 2.1 of
> + * the License, or (at your option) any later version.
> + *
> + * Libgcrypt is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +static const uint32_t encT[256] =
> +  {
> +    0xa56363c6, 0x847c7cf8, 0x997777ee, 0x8d7b7bf6,
> +    0x0df2f2ff, 0xbd6b6bd6, 0xb16f6fde, 0x54c5c591,
> +    0x50303060, 0x03010102, 0xa96767ce, 0x7d2b2b56,
> +    0x19fefee7, 0x62d7d7b5, 0xe6abab4d, 0x9a7676ec,
> +    0x45caca8f, 0x9d82821f, 0x40c9c989, 0x877d7dfa,
> +    0x15fafaef, 0xeb5959b2, 0xc947478e, 0x0bf0f0fb,
> +    0xecadad41, 0x67d4d4b3, 0xfda2a25f, 0xeaafaf45,
> +    0xbf9c9c23, 0xf7a4a453, 0x967272e4, 0x5bc0c09b,
> +    0xc2b7b775, 0x1cfdfde1, 0xae93933d, 0x6a26264c,
> +    0x5a36366c, 0x413f3f7e, 0x02f7f7f5, 0x4fcccc83,
> +    0x5c343468, 0xf4a5a551, 0x34e5e5d1, 0x08f1f1f9,
> +    0x937171e2, 0x73d8d8ab, 0x53313162, 0x3f15152a,
> +    0x0c040408, 0x52c7c795, 0x65232346, 0x5ec3c39d,
> +    0x28181830, 0xa1969637, 0x0f05050a, 0xb59a9a2f,
> +    0x0907070e, 0x36121224, 0x9b80801b, 0x3de2e2df,
> +    0x26ebebcd, 0x6927274e, 0xcdb2b27f, 0x9f7575ea,
> +    0x1b090912, 0x9e83831d, 0x742c2c58, 0x2e1a1a34,
> +    0x2d1b1b36, 0xb26e6edc, 0xee5a5ab4, 0xfba0a05b,
> +    0xf65252a4, 0x4d3b3b76, 0x61d6d6b7, 0xceb3b37d,
> +    0x7b292952, 0x3ee3e3dd, 0x712f2f5e, 0x97848413,
> +    0xf55353a6, 0x68d1d1b9, 0x00000000, 0x2cededc1,
> +    0x60202040, 0x1ffcfce3, 0xc8b1b179, 0xed5b5bb6,
> +    0xbe6a6ad4, 0x46cbcb8d, 0xd9bebe67, 0x4b393972,
> +    0xde4a4a94, 0xd44c4c98, 0xe85858b0, 0x4acfcf85,
> +    0x6bd0d0bb, 0x2aefefc5, 0xe5aaaa4f, 0x16fbfbed,
> +    0xc5434386, 0xd74d4d9a, 0x55333366, 0x94858511,
> +    0xcf45458a, 0x10f9f9e9, 0x06020204, 0x817f7ffe,
> +    0xf05050a0, 0x443c3c78, 0xba9f9f25, 0xe3a8a84b,
> +    0xf35151a2, 0xfea3a35d, 0xc0404080, 0x8a8f8f05,
> +    0xad92923f, 0xbc9d9d21, 0x48383870, 0x04f5f5f1,
> +    0xdfbcbc63, 0xc1b6b677, 0x75dadaaf, 0x63212142,
> +    0x30101020, 0x1affffe5, 0x0ef3f3fd, 0x6dd2d2bf,
> +    0x4ccdcd81, 0x140c0c18, 0x35131326, 0x2fececc3,
> +    0xe15f5fbe, 0xa2979735, 0xcc444488, 0x3917172e,
> +    0x57c4c493, 0xf2a7a755, 0x827e7efc, 0x473d3d7a,
> +    0xac6464c8, 0xe75d5dba, 0x2b191932, 0x957373e6,
> +    0xa06060c0, 0x98818119, 0xd14f4f9e, 0x7fdcdca3,
> +    0x66222244, 0x7e2a2a54, 0xab90903b, 0x8388880b,
> +    0xca46468c, 0x29eeeec7, 0xd3b8b86b, 0x3c141428,
> +    0x79dedea7, 0xe25e5ebc, 0x1d0b0b16, 0x76dbdbad,
> +    0x3be0e0db, 0x56323264, 0x4e3a3a74, 0x1e0a0a14,
> +    0xdb494992, 0x0a06060c, 0x6c242448, 0xe45c5cb8,
> +    0x5dc2c29f, 0x6ed3d3bd, 0xefacac43, 0xa66262c4,
> +    0xa8919139, 0xa4959531, 0x37e4e4d3, 0x8b7979f2,
> +    0x32e7e7d5, 0x43c8c88b, 0x5937376e, 0xb76d6dda,
> +    0x8c8d8d01, 0x64d5d5b1, 0xd24e4e9c, 0xe0a9a949,
> +    0xb46c6cd8, 0xfa5656ac, 0x07f4f4f3, 0x25eaeacf,
> +    0xaf6565ca, 0x8e7a7af4, 0xe9aeae47, 0x18080810,
> +    0xd5baba6f, 0x887878f0, 0x6f25254a, 0x722e2e5c,
> +    0x241c1c38, 0xf1a6a657, 0xc7b4b473, 0x51c6c697,
> +    0x23e8e8cb, 0x7cdddda1, 0x9c7474e8, 0x211f1f3e,
> +    0xdd4b4b96, 0xdcbdbd61, 0x868b8b0d, 0x858a8a0f,
> +    0x907070e0, 0x423e3e7c, 0xc4b5b571, 0xaa6666cc,
> +    0xd8484890, 0x05030306, 0x01f6f6f7, 0x120e0e1c,
> +    0xa36161c2, 0x5f35356a, 0xf95757ae, 0xd0b9b969,
> +    0x91868617, 0x58c1c199, 0x271d1d3a, 0xb99e9e27,
> +    0x38e1e1d9, 0x13f8f8eb, 0xb398982b, 0x33111122,
> +    0xbb6969d2, 0x70d9d9a9, 0x898e8e07, 0xa7949433,
> +    0xb69b9b2d, 0x221e1e3c, 0x92878715, 0x20e9e9c9,
> +    0x49cece87, 0xff5555aa, 0x78282850, 0x7adfdfa5,
> +    0x8f8c8c03, 0xf8a1a159, 0x80898909, 0x170d0d1a,
> +    0xdabfbf65, 0x31e6e6d7, 0xc6424284, 0xb86868d0,
> +    0xc3414182, 0xb0999929, 0x772d2d5a, 0x110f0f1e,
> +    0xcbb0b07b, 0xfc5454a8, 0xd6bbbb6d, 0x3a16162c
> +  };
> +
> +static const uint32_t rcon[30] =
> +  {
> +    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36, 0x6c,
> +    0xd8, 0xab, 0x4d, 0x9a, 0x2f, 0x5e, 0xbc, 0x63, 0xc6, 0x97, 0x35,
> +    0x6a, 0xd4, 0xb3, 0x7d, 0xfa, 0xef, 0xc5, 0x91
> +  };
> diff --git a/elf/tst-dl-arc4random-static.c b/elf/tst-dl-arc4random-static.c
> new file mode 100644
> index 0000000000..d66b342a06
> --- /dev/null
> +++ b/elf/tst-dl-arc4random-static.c
> @@ -0,0 +1,19 @@
> +/* Test for low-level AES-128 routines for the arc4random PRNG, static version.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include "tst-dl-arc4random.c"
> diff --git a/elf/tst-dl-arc4random.c b/elf/tst-dl-arc4random.c
> new file mode 100644
> index 0000000000..fbdb2a1b0f
> --- /dev/null
> +++ b/elf/tst-dl-arc4random.c
> @@ -0,0 +1,48 @@
> +/* Test for low-level AES-128 routines for the arc4random PRNG.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <arc4random.h>
> +#include <string.h>
> +#include <support/check.h>
> +
> +static int
> +do_test (void)
> +{
> +  unsigned char at_random[16]
> +    = { 1, 2, 3, 4, 5, 6, 7, 8,
> +        131, 132, 133, 134, 135, 136, 137, 138};
> +
> +  struct arc4random_global_state state;
> +  memset (&state, 0, sizeof (state));
> +  _dl_arc4random_schedule (&state, at_random);
> +
> +  struct arc4random_personalization personalization;
> +  _Static_assert (sizeof (personalization) == 16, "sizeof (personalization)");
> +  memcpy (&personalization,
> +          "\xf7\xf6\xf5\xf4\xf3\xf2\xf1\xf0"
> +          "\x00\x01\x02\x03\x04\x05\x06\x07",
> +          sizeof (personalization));
> +  struct arc4random_block result;
> +  _dl_arc4random_block (&state, personalization, &result);
> +
> +  TEST_COMPARE_BLOB (&result, sizeof (result),
> +                     "\xa8\x0b\xa2\x8a\xdd\x9ew\\\000aK\xdc/\xa7\xd5\x16", 16);
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
> diff --git a/include/arc4random.h b/include/arc4random.h
> new file mode 100644
> index 0000000000..d7f48f4e4f
> --- /dev/null
> +++ b/include/arc4random.h
> @@ -0,0 +1,99 @@
> +/* Internal definitions for the arc4random PRNG.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +/* The random number generator uses AES-128 to encrypt a pair of
> +   64-bit numbers, a personalization number and a block count.
> +
> +   Without reliable fork detection, the global state (in struct
> +   arc4random_global_state) is mapped with the MAP_SHARED flag, so
> +   that it is shared across fork.  The global state contains the AES
> +   key schedule (derived from the 16-byte secret key used for AES-128
> +   encryption), and the next block number to use.
> +
> +   The per-thread state in struct arc4random_thread_state contains
> +   cached, yet unused results of an AES operation, and information
> +   required to validate that the cache is still valid.  Due to lack of
> +   fork detection, any access to the cache needs to advance the global
> +   block counter (and check that no other thread or process has
> +   advanced it).
> +
> +   Once we have reliable fork detection, a truly thread-local cache is
> +   possible.  */
> +
> +#ifndef _ARC4RANDOM_H
> +#define _ARC4RANDOM_H
> +
> +#include <atomic.h>
> +#include <stdint.h>
> +
> +/* AES-128 is specified to use 10 rounds.  */
> +enum { arc4random_aes_rounds = 10 };
> +
> +/* AES-128 produces output blocks of 16 bytes.  */
> +enum { arc4random_block_size = 16 };
> +
> +struct arc4random_block
> +{
> +  uint32_t data[4];
> +};
> +
> +/* The data which is encrypted using AES-128.  The personalization
> +   value must come from struct arc4random_global_state.  */
> +struct arc4random_personalization
> +{
> +  /* Unique number assigned to this thread.  Note that the ID is *not*
> +     necessarily unique across threads in different processes.
> +     Therefore, it is still necessary to ensure divergence of the
> +     random bit streams by other means.  ID 0 is reserved for the
> +     dynamic linker itself.  */
> +  uint64_t thread_id;
> +
> +  /* The block number within a single thread.  This is advanced each
> +     time a new block of randomness is obtained.  */
> +  uint64_t block_number;
> +};
> +_Static_assert (sizeof (struct arc4random_personalization)
> +                == arc4random_block_size,
> +                "personalization size matches AES-128 block size");
> +
> +/* The global random number generator state.  This is located on a
> +   page allocated with MAP_SHARED, so that it is inherited across
> +   fork.  rtld_global contains a pointer to such an object.  */
> +struct arc4random_global_state
> +{
> +  /* AES key schedule.  Show the alignment to the compiler.  (In
> +     practice, this struct is allocated at a page-aligned
> +     address.)  */
> +  uint32_t key_schedule[arc4random_aes_rounds + 1][4]
> +    __attribute__ ((aligned (128)));
> +};
> +
> +/* Initializes the AES-128 key schedule.  KEY must point to 16 bytes
> +   of key material.  Exported from ld.so for use with reseeding.  */
> +void _dl_arc4random_schedule (struct arc4random_global_state *,
> +                              const unsigned char *key);
> +rtld_hidden_proto (_dl_arc4random_schedule)
> +
> +/* Computes one block of random data.  This function is exported from
> +   ld.so for use within libc.so.  */
> +void _dl_arc4random_block (const struct arc4random_global_state *,
> +                           const struct arc4random_personalization,
> +                           struct arc4random_block *output);
> +rtld_hidden_proto (_dl_arc4random_block)
> +
> +#endif /* _ARC4RANDOM_H */
>
  
Florian Weimer March 5, 2018, 2:48 p.m. UTC | #3
On 03/02/2018 08:27 PM, Adhemerval Zanella wrote:
> 
> 
> On 02/03/2018 09:53, Florian Weimer wrote:
>> This commit imports the AES-128 implementation from libgcrypt.
>>
>> This code has to reside in ld.so because it will be used to
>> initialize the stack protector cookie and the pointer guard
>> from the AT_RANDOM variable.
>>
>> AES-128 was chosen as the cryptographic primitive because hardware
>> support for AES-128 is much more widespread than for SHA-1 or SHA-256.
>> This means that we can add hardware acceleration for arc4random for
>> a larger number of systems, as a subsequent optimization.
> 
> I noted other system (*BSD, Linux kernel, etc.) are using ChaCha20 instead
> of AES-128 for both arc4random and /dev/{u}random, but I don't have much

FreeBSD still uses RC4 for arc4random for some reason.

> information why exactly ChaCha20 was picked instead.  Checking some
> discussion why ChaCha20 is preferable [1] it seems is usually faster on
> hardware without specialized instructions and less susceptible to cache
> timing attacks. However, cryptoanalysis is not really forte, so I just
> curious why we should do something different than others for arc4random.
> 
> [1] https://crypto.stackexchange.com/questions/34455/whats-the-appeal-of-using-chacha20-instead-of-aes

The advantage of ChaCha20 is that the key schedule is very cheap.  This 
means that you can feed back the output from the generator and use it 
for the next block (this is called “backtracking resistance”).  It's not 
really advisable to do this with a software implementation of AES-128 
for performance reasons.
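
As a rough illustration of the feedback idea (a minimal sketch, not the libbsd code): each step runs the ChaCha20 block function with the RFC 8439 state layout once, overwrites the key with the first 32 bytes of the block, and hands out the remaining 32 bytes.  The names fb_generator and fb_generator_step are made up for this example, and the fixed zero counter and nonce are a simplification.

```c
#include <stdint.h>
#include <string.h>

static inline uint32_t
rotl32 (uint32_t value, int shift)
{
  return (value << shift) | (value >> (32 - shift));
}

/* ChaCha quarter round on four state words.  */
#define QR(a, b, c, d)                         \
  do                                           \
    {                                          \
      a += b; d ^= a; d = rotl32 (d, 16);      \
      c += d; b ^= c; b = rotl32 (b, 12);      \
      a += b; d ^= a; d = rotl32 (d, 8);       \
      c += d; b ^= c; b = rotl32 (b, 7);       \
    }                                          \
  while (0)

/* ChaCha20 block function with the RFC 8439 state layout: 4 constant
   words, 8 key words, a 32-bit block counter, 3 nonce words.  The
   result is left as 16 host-order words.  */
static void
chacha20_block (const uint32_t key[8], uint32_t counter,
                const uint32_t nonce[3], uint32_t out[16])
{
  uint32_t in[16] = { 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574 };
  memcpy (&in[4], key, 8 * sizeof (uint32_t));
  in[12] = counter;
  memcpy (&in[13], nonce, 3 * sizeof (uint32_t));

  uint32_t x[16];
  memcpy (x, in, sizeof (x));
  for (int i = 0; i < 10; i++)
    {
      /* Column round followed by diagonal round.  */
      QR (x[0], x[4], x[8], x[12]);
      QR (x[1], x[5], x[9], x[13]);
      QR (x[2], x[6], x[10], x[14]);
      QR (x[3], x[7], x[11], x[15]);
      QR (x[0], x[5], x[10], x[15]);
      QR (x[1], x[6], x[11], x[12]);
      QR (x[2], x[7], x[8], x[13]);
      QR (x[3], x[4], x[9], x[14]);
    }
  for (int i = 0; i < 16; i++)
    out[i] = x[i] + in[i];
}

/* Hypothetical generator state: just the current key.  */
struct fb_generator
{
  uint32_t key[8];
};

/* One feedback step.  Words 0..7 of the block become the next key, so
   the old key is gone once this returns; words 8..15 are the output.  */
static void
fb_generator_step (struct fb_generator *gen, uint32_t out[8])
{
  static const uint32_t nonce[3] = { 0, 0, 0 };
  uint32_t block[16];
  chacha20_block (gen->key, 0, nonce, block);
  memcpy (gen->key, &block[0], sizeof (gen->key));
  memcpy (out, &block[8], 8 * sizeof (uint32_t));
  /* Production code would wipe BLOCK with explicit_bzero here.  */
}
```

Because the key is overwritten on every step, a later compromise of the state does not directly expose earlier outputs.  The price in this sketch is one full block computation per 32 bytes of output, plus what amounts to a rekeying on every step.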

The downside is that this Xₙ := fⁿ(X₀) construction risks running into
a small(ish) cycle after many blocks.  Therefore, the implementation in
libbsd feeds back not just the 256 key bits, but also 64 bits for the
initial vector, probably hoping that the 320 bits make it sufficiently
unlikely that the initial run until the first repeated block is shorter
than 2**64 iterations or so.

If you encrypt a counter using AES-128, you do not have this problem
because the encrypted blocks are all distinct, but this means there is a
generic discriminator: the block repeats that would be expected after
2**64 or so blocks (due to the birthday paradox) simply do not happen.
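
To make the discriminator concrete, here is a toy demonstration with 16-bit blocks instead of 128-bit ones.  toy_permute is a made-up keyed bijection standing in for AES-128 (it is in no way secure), and toy_random is a made-up truncated mixer standing in for an ideal random stream.  With 128-bit blocks, n truly random blocks would contain about n**2 / 2**129 repeated pairs, roughly 0.5 at n = 2**64, whereas the encrypted counter yields none.

```c
#include <stdint.h>
#include <string.h>

/* Toy keyed bijection on 16-bit values (stand-in for a block cipher;
   NOT secure).  Every step is invertible modulo 2**16.  */
static uint16_t
toy_permute (uint16_t key, uint16_t counter)
{
  uint16_t x = counter ^ key;
  x = (uint16_t) (x * 0x9e37u);   /* Odd multiplier.  */
  x ^= (uint16_t) (x >> 7);       /* Xorshift.  */
  x = (uint16_t) (x * 0x79b9u);   /* Odd multiplier.  */
  return x;
}

/* Toy stand-in for independent random 16-bit blocks: a 32-bit mixer
   truncated to 16 bits.  */
static uint16_t
toy_random (uint32_t i)
{
  uint32_t x = i * 0x9e3779b9u;
  x ^= x >> 16;
  x *= 0x85ebca6bu;
  x ^= x >> 13;
  return (uint16_t) (x >> 16);
}

/* Returns how many of the first N outputs equal an earlier output.  */
static unsigned int
count_repeats (int use_permutation, uint16_t key, uint32_t n)
{
  static unsigned char seen[65536];
  memset (seen, 0, sizeof (seen));
  unsigned int repeats = 0;
  for (uint32_t i = 0; i < n; i++)
    {
      uint16_t v = (use_permutation
                    ? toy_permute (key, (uint16_t) i)
                    : toy_random (i ^ key));
      repeats += seen[v];
      seen[v] = 1;
    }
  return repeats;
}
```

For n = 4096, an ideal random source of 16-bit blocks would be expected to show about 4096**2 / 2**17 = 128 repeats, while count_repeats (1, key, n) is exactly zero for any n up to 2**16.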

Another advantage of the encrypted counter approach is that you have very
little per-thread state.  Basically just the counter and a key stream
discriminator (see pthread_thread_number_np), although for some
coprocessor implementations, it may be beneficial to create more than a
single block per iteration.
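
A minimal sketch of what that per-thread state could look like, assuming the struct layout from include/arc4random.h in this patch.  The names thread_rng, thread_rng_init, thread_rng_next and copy_block are made up for this example, and block_fn merely mirrors the shape of _dl_arc4random_block so that the sketch compiles on its own.

```c
#include <stdint.h>
#include <string.h>

/* Same layout as in the patch.  */
struct arc4random_block
{
  uint32_t data[4];
};

struct arc4random_personalization
{
  uint64_t thread_id;      /* Key stream discriminator.  */
  uint64_t block_number;   /* Per-thread counter.  */
};

struct arc4random_global_state;  /* Opaque here; holds the key schedule.  */

/* Shape of _dl_arc4random_block, passed in so that this sketch does
   not depend on ld.so.  */
typedef void (*block_fn) (const struct arc4random_global_state *,
                          struct arc4random_personalization,
                          struct arc4random_block *);

/* Hypothetical per-thread state: nothing but the cipher input.  */
struct thread_rng
{
  struct arc4random_personalization next;
};

static void
thread_rng_init (struct thread_rng *rng, uint64_t thread_id)
{
  rng->next.thread_id = thread_id;
  rng->next.block_number = 0;
}

/* Encrypts the current (thread_id, block_number) pair and advances
   the counter.  */
static void
thread_rng_next (struct thread_rng *rng,
                 const struct arc4random_global_state *global,
                 block_fn encrypt, struct arc4random_block *out)
{
  encrypt (global, rng->next, out);
  ++rng->next.block_number;
}

/* Test double only: copies the input to the output.  NOT a cipher.  */
static void
copy_block (const struct arc4random_global_state *global,
            struct arc4random_personalization input,
            struct arc4random_block *out)
{
  (void) global;
  memcpy (out, &input, sizeof (*out));
}
```

In libc.so, the encrypt argument would be _dl_arc4random_block and global would point to the shared key schedule.  Thread ID 0 is reserved for the dynamic linker, according to the header comment.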

The backtracking protection in libbsd still looks somewhat expensive, so
libbsd generates 1024 output bytes for each feedback step.  40 bytes are
fed back, the rest is returned to the application piece by piece.  This
buffer really has to be thread-local if we want an implementation which
scales, and using 1024 bytes for this seems to be a bit over the top.  We
could probably do with fewer bytes than that (40 + X), but it will
substantially reduce generator throughput.

Performance-wise, on current Intel CPUs with AES support, the AES-128
encrypted counter approach will provide a throughput of around 3
gigabytes per second, with 80 bytes of per-thread state.  I expect that
the ChaCha20 approach in libbsd will reach this level of performance
only with a large per-thread buffer, such as the 1064 bytes used in
libbsd, which should give around 2.75 gigabytes per second.  With 396
bytes of per-thread state, the predicted performance is 1.1 gigabytes per
second, and with 104 bytes, it is 0.24 gigabytes per second.

The nominal security strength of ChaCha20 is higher than that of AES-128 
(256 bits vs less than 128 bits), but this is for the cipher itself, not 
for the generators derived from it.  I'm not aware of any reviews of the 
actual generators.

So if we want backtracking protection, we'd probably have to go with the 
396-byte ChaCha20 approach (maybe after recovering the TLS space 
occupied by _res).  Otherwise, AES-128 will be the better choice for a 
lot of users (who have access to hardware with AES-128 acceleration).

Unfortunately, maintaining both approaches has quite a bit of overhead 
because they are so different.

Thanks,
Florian
  
Adhemerval Zanella March 5, 2018, 7:10 p.m. UTC | #4
On 05/03/2018 11:48, Florian Weimer wrote:
> On 03/02/2018 08:27 PM, Adhemerval Zanella wrote:
>>
>>
>> On 02/03/2018 09:53, Florian Weimer wrote:
>>> This commit imports the AES-128 implementation from libgcrypt.
>>>
>>> This code has to reside in ld.so because it will be used to
>>> initialize the stack protector cookie and the pointer guard
>>> from the AT_RANDOM variable.
>>>
>>> AES-128 was chosen as the cryptographic primitive because hardware
>>> support for AES-128 is much more widespread than for SHA-1 or SHA-256.
>>> This means that we can add hardware acceleration for arc4random for
>>> a larger number of systems, as a subsequent optimization.
>>
>> I noted other system (*BSD, Linux kernel, etc.) are using ChaCha20 instead
>> of AES-128 for both arc4random and /dev/{u}random, but I don't have much
> 
> FreeBSD still uses RC4 for arc4random for some reason.
> 

Only in userspace, and I think what is holding up ChaCha20 adoption there
is how to handle the state after fork [1].  The kernel interface uses
ChaCha20 [2].

[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=182610
[2] https://github.com/freebsd/freebsd/blob/bd86cc8ebf468975220088041fbe0b3be5649310/sys/libkern/arc4random.c

>> information why exactly ChaCha20 was picked instead.  Checking some
>> discussion why ChaCha20 is preferable [1] it seems is usually faster on
>> hardware without specialized instructions and less susceptible to cache
>> timing attacks. However, cryptoanalysis is not really forte, so I just
>> curious why we should do something different than others for arc4random.
>>
>> [1] https://crypto.stackexchange.com/questions/34455/whats-the-appeal-of-using-chacha20-instead-of-aes
> 
> The advantage of ChaCha20 is that the key schedule is very cheap.  This means that you can feed back the output from the generator and use it for the next block (this is called “backtracking resistance”).  It's not really advisable to do this with a software implementation of AES-128 for performance reasons.
> 
> The downside is that this   Xₙ := fⁿ(X₀) construction risks into running a small(ish) cycle after many blocks.  Therefore, the implementation in libbsd feeds back not just the 256 key bits, but also 64 bits for the initial vector, probably hoping that the 320 bits make it sufficiently unlikely that the initial run until the first repeated block is shorter than 2**64 iterations or so.
> 
> If you encrypt a counter using AES-128, you do not have this problem because the encrypted blocks are all distinct, but this means there is a generic discriminator because expected block repeats after 2**64 or so blocks (due to the birthday paradox) simply do not happen.
> 
> Another advantage of encrypted counter approach is that you have very little per-thread state.  Basically just the counter and a key stream discriminator (see pthread_thread_number_np), although for some coprocessor implementations, it may be beneficial to create more than a single block per iteration.

Right, thanks for the thoughtful explanation; this seems to be a winning
factor compared to the ChaCha20 one.

> 
> The backtracking protection in libbsd still looks somewhat expensive, so libbsd generates 1024 output bytes for each feedback step.  40 bytes are fed back, the rest is returned to the application piece by piece.  This buffer really has to be thread-local if we want an implementation which scales, and using 1024 bytes for this seems to be a bit over top.  We could probably do with fewer bytes than that (40 + X), but it will substantially reduce generator throughput.

The Linux implementation seems to be doing the latter, at least in the
'extract_crng_user' function, by feeding 'crng_backtrack_protect' with at
most 64 bytes (minus those extracted for the user).  I am not sure if this
suffices for backtracking protection, but it seems feasible for a
per-thread state.

> 
> Performance-wise, on current Intel CPUs with AES support, the AES-128 encrypted counter approach will provide a throughput of around 3 gigabyte per second, with 80 bytes of per-thread state.  I expect that the ChaCha20 approach in libbsd will reach this level of performance only with a per-thread large buffer, such as the 1064 bytes used in libbsd, which should give around 2.75 gigabyte per second.  With 396 bytes of per-thread state, the predicted performance is 1.1 gigabyte per second, and with 104 bytes, it is 0.24 gigabyte per second.

My main concern is selecting an algorithm because we get improved numbers on
certain platforms, while on others (which do not have hardware acceleration) a
different approach would be more suitable for both performance and security.

However, as you noted, if the idea is to add scalability, I think per-thread
state should be considered as well, and it looks like AES-128 in software mode
should be good enough (and with my qsort refactoring patches and the recent
thread about the O(n^2) worst case being a security issue, I think it would be
good to have a fast per-thread RNG).  Do you have any numbers for this
software implementation?

> 
> The nominal security strength of ChaCha20 is higher than that of AES-128 (256 bits vs less than 128 bits), but this is for the cipher itself, not for the generators derived from it.  I'm not aware of any reviews of the actual generators.
> 
> So if we want backtracking protection, we'd probably have to go with the 396-byte ChaCha20 approach (maybe after recovering the TLS space occupied by _res).  Otherwise, AES-128 will be the better choice for a lot of users (who have access to hardware with AES-128 acceleration).
> 
> Unfortunately, maintaining both approaches has quite a bit of overhead because they are so different.

I agree and I think we should avoid it as well.
  
Florian Weimer May 14, 2018, 3:35 p.m. UTC | #5
On 03/02/2018 06:15 PM, Joseph Myers wrote:
> On Fri, 2 Mar 2018, Florian Weimer wrote:
> 
>> diff --git a/LICENSES b/LICENSES
>> index 80f7f14879..dc73a5a198 100644
>> --- a/LICENSES
>> +++ b/LICENSES
>> @@ -467,3 +467,59 @@ Copyright 2001 by Stephen L. Moshier <moshier@na-net.ornl.gov>
>>    You should have received a copy of the GNU Lesser General Public
>>    License along with this library; if not, see
>>    <http://www.gnu.org/licenses/>.  */
>> +
>> +elf/dl-arc4random.c imports code from libcrypt, with the following
>> +notices:
> 
> "from libgcrypt" (not libcrypt)?

Thanks, fixed locally.

Florian
  

Patch

diff --git a/LICENSES b/LICENSES
index 80f7f14879..dc73a5a198 100644
--- a/LICENSES
+++ b/LICENSES
@@ -467,3 +467,59 @@  Copyright 2001 by Stephen L. Moshier <moshier@na-net.ornl.gov>
  You should have received a copy of the GNU Lesser General Public
  License along with this library; if not, see
  <http://www.gnu.org/licenses/>.  */
+
+elf/dl-arc4random.c imports code from libgcrypt, with the following
+notices:
+
+Rijndael (AES) for GnuPG
+Copyright (C) 2000, 2001, 2002, 2003, 2007,
+              2008, 2011, 2012 Free Software Foundation, Inc.
+
+This file is part of Libgcrypt.
+
+Libgcrypt is free software; you can redistribute it and/or modify
+it under the terms of the GNU Lesser General Public License as
+published by the Free Software Foundation; either version 2.1 of
+the License, or (at your option) any later version.
+
+Libgcrypt is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this program; if not, see <http://www.gnu.org/licenses/>.
+******************************************************************
+The code here is based on the optimized implementation taken from
+http://www.esat.kuleuven.ac.be/~rijmen/rijndael/ on Oct 2, 2000,
+which carries this notice:
+------------------------------------------
+rijndael-alg-fst.c   v2.3   April '2000
+
+Optimised ANSI C code
+
+authors: v1.0: Antoon Bosselaers
+         v2.0: Vincent Rijmen
+         v2.3: Paulo Barreto
+
+This code is placed in the public domain.
+------------------------------------------
+
+rijndael-tables.h - Rijndael (AES) for GnuPG,
+Copyright (C) 2000, 2001, 2002, 2003, 2007,
+              2008 Free Software Foundation, Inc.
+
+This file is part of Libgcrypt.
+
+Libgcrypt is free software; you can redistribute it and/or modify
+it under the terms of the GNU Lesser General Public License as
+published by the Free Software Foundation; either version 2.1 of
+the License, or (at your option) any later version.
+
+Libgcrypt is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this program; if not, see <http://www.gnu.org/licenses/>.
diff --git a/elf/Makefile b/elf/Makefile
index 9bdb9220c7..043c0242a4 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -33,7 +33,7 @@  dl-routines	= $(addprefix dl-,load lookup object reloc deps hwcaps \
 				  runtime init fini debug misc \
 				  version profile tls origin scope \
 				  execstack open close trampoline \
-				  exception sort-maps)
+				  exception sort-maps arc4random)
 ifeq (yes,$(use-ldconfig))
 dl-routines += dl-cache
 endif
@@ -394,6 +394,10 @@  ifeq (yesno,$(nss-crypt)$(static-nss-crypt))
 CFLAGS-tst-linkall-static.c += -UUSE_CRYPT -DUSE_CRYPT=0
 endif
 
+# Test for internal-only PRNG building blocks.
+tests-internal += tst-dl-arc4random tst-dl-arc4random-static
+tests-static += tst-dl-arc4random-static
+
 include ../Rules
 
 ifeq (yes,$(build-shared))
diff --git a/elf/Versions b/elf/Versions
index 3b09901f6c..2c3e4bd061 100644
--- a/elf/Versions
+++ b/elf/Versions
@@ -78,5 +78,8 @@  ld {
 
     # Set value of a tunable.
     __tunable_get_val;
+
+    # PRNG building blocks.
+    _dl_arc4random_schedule; _dl_arc4random_block;
   }
 }
diff --git a/elf/dl-arc4random.c b/elf/dl-arc4random.c
new file mode 100644
index 0000000000..26ec061764
--- /dev/null
+++ b/elf/dl-arc4random.c
@@ -0,0 +1,401 @@ 
+/* Low-level AES-128 routines for the arc4random PRNG.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+
+#include <arc4random.h>
+
+/* Copied from libgcrypt.  */
+static void do_setkey (struct arc4random_global_state *,
+                       const unsigned char *);
+static void
+do_encrypt_fn (const struct arc4random_global_state *state,
+               struct arc4random_block *b,
+               const struct arc4random_block *a);
+static const uint32_t encT[256];
+static const uint32_t rcon[30];
+
+void
+_dl_arc4random_schedule (struct arc4random_global_state *state,
+                         const unsigned char *key)
+{
+  do_setkey (state, key);
+}
+rtld_hidden_def (_dl_arc4random_schedule)
+
+void
+_dl_arc4random_block (const struct arc4random_global_state *state,
+                      struct arc4random_personalization personalization,
+                      struct arc4random_block *output)
+{
+  union
+  {
+    struct arc4random_personalization personalization;
+    struct arc4random_block block;
+  } u;
+  u.personalization = personalization;
+  do_encrypt_fn (state, output, &u.block);
+}
+rtld_hidden_def (_dl_arc4random_block)
+
+static inline uint32_t
+le_bswap32 (uint32_t value)
+{
+#if __BYTE_ORDER == __BIG_ENDIAN
+  return __builtin_bswap32 (value);
+#elif  __BYTE_ORDER == __LITTLE_ENDIAN
+  return value;
+#else
+# error invalid __BYTE_ORDER
+#endif
+}
+
+static inline uint32_t
+rol (uint32_t value, unsigned int shift)
+{
+  return (value << (shift & 31)) | (value >> ((32 - shift) & 31));
+}
+
+/* The do_setkey and do_encrypt_fn functions are based on rijndael.c
+   from libgcrypt 1.8.1, which has the following copyright
+   information.  */
+
+/* Rijndael (AES) for GnuPG
+ * Copyright (C) 2000, 2001, 2002, 2003, 2007,
+ *               2008, 2011, 2012 Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ *******************************************************************
+ * The code here is based on the optimized implementation taken from
+ * http://www.esat.kuleuven.ac.be/~rijmen/rijndael/ on Oct 2, 2000,
+ * which carries this notice:
+ *------------------------------------------
+ * rijndael-alg-fst.c   v2.3   April '2000
+ *
+ * Optimised ANSI C code
+ *
+ * authors: v1.0: Antoon Bosselaers
+ *          v2.0: Vincent Rijmen
+ *          v2.3: Paulo Barreto
+ *
+ * This code is placed in the public domain.
+ *------------------------------------------
+ *
+ * The SP800-38a document is available at:
+ *   http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf
+ *
+ */
+
+static void
+do_setkey (struct arc4random_global_state *state, const unsigned char *key)
+{
+  enum { KC = 4, keylen = 16 };
+  const unsigned char *sbox = ((const unsigned char *) encT) + 1;
+  union
+  {
+    unsigned char tk[KC][4];
+    uint32_t tk_u32[KC];
+  } u;
+  int rconpointer = 0;
+
+  typedef uint32_t unaligned_uint32_t __attribute__ ((aligned (1)));
+  const unaligned_uint32_t *key32 = (const unaligned_uint32_t *) key;
+
+  for (int j = 0; j < KC; ++j)
+    u.tk_u32[j] = key32[j];
+
+  int r = 0;
+  int t = 0;
+  /* Copy values into round key array.  */
+  for (int j = 0; j < KC && r < arc4random_aes_rounds + 1; )
+    {
+      for (; j < KC && t < 4; j++, t++)
+        state->key_schedule[r][t] = le_bswap32 (u.tk_u32[j]);
+      if (t == 4)
+        {
+          r++;
+          t = 0;
+        }
+    }
+
+  while (r < arc4random_aes_rounds + 1)
+    {
+      /* While not enough round key material has been calculated,
+         calculate new values.  */
+      u.tk[0][0] ^= sbox[u.tk[KC - 1][1] * 4];
+      u.tk[0][1] ^= sbox[u.tk[KC - 1][2] * 4];
+      u.tk[0][2] ^= sbox[u.tk[KC - 1][3] * 4];
+      u.tk[0][3] ^= sbox[u.tk[KC - 1][0] * 4];
+      u.tk[0][0] ^= rcon[rconpointer++];
+
+      for (int j = 1; j < KC; j++)
+        u.tk_u32[j] ^= u.tk_u32[j - 1];
+
+      /* Copy values into round key array.  */
+      for (int j = 0; j < KC && r < arc4random_aes_rounds + 1; )
+        {
+          for (; j < KC && t < 4; j++, t++)
+            state->key_schedule[r][t] = le_bswap32 (u.tk_u32[j]);
+          if (t == 4)
+            {
+              r++;
+              t = 0;
+            }
+        }
+    }
+}
+
+static void
+do_encrypt_fn (const struct arc4random_global_state *state,
+               struct arc4random_block *b,
+               const struct arc4random_block *a)
+{
+  const unsigned char *sbox = ((const unsigned char *)encT) + 1;
+  int r;
+  uint32_t sa[4];
+  uint32_t sb[4];
+
+  sb[0] = le_bswap32 (a->data[0]);
+  sb[1] = le_bswap32 (a->data[1]);
+  sb[2] = le_bswap32 (a->data[2]);
+  sb[3] = le_bswap32 (a->data[3]);
+
+  sa[0] = sb[0] ^ state->key_schedule[0][0];
+  sa[1] = sb[1] ^ state->key_schedule[0][1];
+  sa[2] = sb[2] ^ state->key_schedule[0][2];
+  sa[3] = sb[3] ^ state->key_schedule[0][3];
+
+  sb[0] = rol (encT[(uint8_t) (sa[0] >> (0 * 8))], (0 * 8));
+  sb[3] = rol (encT[(uint8_t) (sa[0] >> (1 * 8))], (1 * 8));
+  sb[2] = rol (encT[(uint8_t) (sa[0] >> (2 * 8))], (2 * 8));
+  sb[1] = rol (encT[(uint8_t) (sa[0] >> (3 * 8))], (3 * 8));
+  sa[0] = state->key_schedule[1][0] ^ sb[0];
+
+  sb[1] ^= rol (encT[(uint8_t) (sa[1] >> (0 * 8))], (0 * 8));
+  sa[0] ^= rol (encT[(uint8_t) (sa[1] >> (1 * 8))], (1 * 8));
+  sb[3] ^= rol (encT[(uint8_t) (sa[1] >> (2 * 8))], (2 * 8));
+  sb[2] ^= rol (encT[(uint8_t) (sa[1] >> (3 * 8))], (3 * 8));
+  sa[1] = state->key_schedule[1][1] ^ sb[1];
+
+  sb[2] ^= rol (encT[(uint8_t) (sa[2] >> (0 * 8))], (0 * 8));
+  sa[1] ^= rol (encT[(uint8_t) (sa[2] >> (1 * 8))], (1 * 8));
+  sa[0] ^= rol (encT[(uint8_t) (sa[2] >> (2 * 8))], (2 * 8));
+  sb[3] ^= rol (encT[(uint8_t) (sa[2] >> (3 * 8))], (3 * 8));
+  sa[2] = state->key_schedule[1][2] ^ sb[2];
+
+  sb[3] ^= rol (encT[(uint8_t) (sa[3] >> (0 * 8))], (0 * 8));
+  sa[2] ^= rol (encT[(uint8_t) (sa[3] >> (1 * 8))], (1 * 8));
+  sa[1] ^= rol (encT[(uint8_t) (sa[3] >> (2 * 8))], (2 * 8));
+  sa[0] ^= rol (encT[(uint8_t) (sa[3] >> (3 * 8))], (3 * 8));
+  sa[3] = state->key_schedule[1][3] ^ sb[3];
+
+  for (r = 2; r < arc4random_aes_rounds; r++)
+    {
+      sb[0] = rol (encT[(uint8_t) (sa[0] >> (0 * 8))], (0 * 8));
+      sb[3] = rol (encT[(uint8_t) (sa[0] >> (1 * 8))], (1 * 8));
+      sb[2] = rol (encT[(uint8_t) (sa[0] >> (2 * 8))], (2 * 8));
+      sb[1] = rol (encT[(uint8_t) (sa[0] >> (3 * 8))], (3 * 8));
+      sa[0] = state->key_schedule[r][0] ^ sb[0];
+
+      sb[1] ^= rol (encT[(uint8_t) (sa[1] >> (0 * 8))], (0 * 8));
+      sa[0] ^= rol (encT[(uint8_t) (sa[1] >> (1 * 8))], (1 * 8));
+      sb[3] ^= rol (encT[(uint8_t) (sa[1] >> (2 * 8))], (2 * 8));
+      sb[2] ^= rol (encT[(uint8_t) (sa[1] >> (3 * 8))], (3 * 8));
+      sa[1] = state->key_schedule[r][1] ^ sb[1];
+
+      sb[2] ^= rol (encT[(uint8_t) (sa[2] >> (0 * 8))], (0 * 8));
+      sa[1] ^= rol (encT[(uint8_t) (sa[2] >> (1 * 8))], (1 * 8));
+      sa[0] ^= rol (encT[(uint8_t) (sa[2] >> (2 * 8))], (2 * 8));
+      sb[3] ^= rol (encT[(uint8_t) (sa[2] >> (3 * 8))], (3 * 8));
+      sa[2] = state->key_schedule[r][2] ^ sb[2];
+
+      sb[3] ^= rol (encT[(uint8_t) (sa[3] >> (0 * 8))], (0 * 8));
+      sa[2] ^= rol (encT[(uint8_t) (sa[3] >> (1 * 8))], (1 * 8));
+      sa[1] ^= rol (encT[(uint8_t) (sa[3] >> (2 * 8))], (2 * 8));
+      sa[0] ^= rol (encT[(uint8_t) (sa[3] >> (3 * 8))], (3 * 8));
+      sa[3] = state->key_schedule[r][3] ^ sb[3];
+
+      r++;
+
+      sb[0] = rol (encT[(uint8_t) (sa[0] >> (0 * 8))], (0 * 8));
+      sb[3] = rol (encT[(uint8_t) (sa[0] >> (1 * 8))], (1 * 8));
+      sb[2] = rol (encT[(uint8_t) (sa[0] >> (2 * 8))], (2 * 8));
+      sb[1] = rol (encT[(uint8_t) (sa[0] >> (3 * 8))], (3 * 8));
+      sa[0] = state->key_schedule[r][0] ^ sb[0];
+
+      sb[1] ^= rol (encT[(uint8_t) (sa[1] >> (0 * 8))], (0 * 8));
+      sa[0] ^= rol (encT[(uint8_t) (sa[1] >> (1 * 8))], (1 * 8));
+      sb[3] ^= rol (encT[(uint8_t) (sa[1] >> (2 * 8))], (2 * 8));
+      sb[2] ^= rol (encT[(uint8_t) (sa[1] >> (3 * 8))], (3 * 8));
+      sa[1] = state->key_schedule[r][1] ^ sb[1];
+
+      sb[2] ^= rol (encT[(uint8_t) (sa[2] >> (0 * 8))], (0 * 8));
+      sa[1] ^= rol (encT[(uint8_t) (sa[2] >> (1 * 8))], (1 * 8));
+      sa[0] ^= rol (encT[(uint8_t) (sa[2] >> (2 * 8))], (2 * 8));
+      sb[3] ^= rol (encT[(uint8_t) (sa[2] >> (3 * 8))], (3 * 8));
+      sa[2] = state->key_schedule[r][2] ^ sb[2];
+
+      sb[3] ^= rol (encT[(uint8_t) (sa[3] >> (0 * 8))], (0 * 8));
+      sa[2] ^= rol (encT[(uint8_t) (sa[3] >> (1 * 8))], (1 * 8));
+      sa[1] ^= rol (encT[(uint8_t) (sa[3] >> (2 * 8))], (2 * 8));
+      sa[0] ^= rol (encT[(uint8_t) (sa[3] >> (3 * 8))], (3 * 8));
+      sa[3] = state->key_schedule[r][3] ^ sb[3];
+    }
+
+  /* Last round is special.  */
+
+  sb[0] = (sbox[(uint8_t) (sa[0] >> (0 * 8)) * 4]) << (0 * 8);
+  sb[3] = (sbox[(uint8_t) (sa[0] >> (1 * 8)) * 4]) << (1 * 8);
+  sb[2] = (sbox[(uint8_t) (sa[0] >> (2 * 8)) * 4]) << (2 * 8);
+  sb[1] = (sbox[(uint8_t) (sa[0] >> (3 * 8)) * 4]) << (3 * 8);
+  sa[0] = state->key_schedule[r][0] ^ sb[0];
+
+  sb[1] ^= (sbox[(uint8_t) (sa[1] >> (0 * 8)) * 4]) << (0 * 8);
+  sa[0] ^= (sbox[(uint8_t) (sa[1] >> (1 * 8)) * 4]) << (1 * 8);
+  sb[3] ^= (sbox[(uint8_t) (sa[1] >> (2 * 8)) * 4]) << (2 * 8);
+  sb[2] ^= (sbox[(uint8_t) (sa[1] >> (3 * 8)) * 4]) << (3 * 8);
+  sa[1] = state->key_schedule[r][1] ^ sb[1];
+
+  sb[2] ^= (sbox[(uint8_t) (sa[2] >> (0 * 8)) * 4]) << (0 * 8);
+  sa[1] ^= (sbox[(uint8_t) (sa[2] >> (1 * 8)) * 4]) << (1 * 8);
+  sa[0] ^= (sbox[(uint8_t) (sa[2] >> (2 * 8)) * 4]) << (2 * 8);
+  sb[3] ^= (sbox[(uint8_t) (sa[2] >> (3 * 8)) * 4]) << (3 * 8);
+  sa[2] = state->key_schedule[r][2] ^ sb[2];
+
+  sb[3] ^= (sbox[(uint8_t) (sa[3] >> (0 * 8)) * 4]) << (0 * 8);
+  sa[2] ^= (sbox[(uint8_t) (sa[3] >> (1 * 8)) * 4]) << (1 * 8);
+  sa[1] ^= (sbox[(uint8_t) (sa[3] >> (2 * 8)) * 4]) << (2 * 8);
+  sa[0] ^= (sbox[(uint8_t) (sa[3] >> (3 * 8)) * 4]) << (3 * 8);
+  sa[3] = state->key_schedule[r][3] ^ sb[3];
+
+  b->data[0] = le_bswap32 (sa[0]);
+  b->data[1] = le_bswap32 (sa[1]);
+  b->data[2] = le_bswap32 (sa[2]);
+  b->data[3] = le_bswap32 (sa[3]);
+}
+
+
+/* The encT table is derived from the rijndael-tables.h file in
+   libgcrypt 1.8.1.  */
+
+/* rijndael-tables.h - Rijndael (AES) for GnuPG,
+ * Copyright (C) 2000, 2001, 2002, 2003, 2007,
+ *               2008 Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+static const uint32_t encT[256] =
+  {
+    0xa56363c6, 0x847c7cf8, 0x997777ee, 0x8d7b7bf6,
+    0x0df2f2ff, 0xbd6b6bd6, 0xb16f6fde, 0x54c5c591,
+    0x50303060, 0x03010102, 0xa96767ce, 0x7d2b2b56,
+    0x19fefee7, 0x62d7d7b5, 0xe6abab4d, 0x9a7676ec,
+    0x45caca8f, 0x9d82821f, 0x40c9c989, 0x877d7dfa,
+    0x15fafaef, 0xeb5959b2, 0xc947478e, 0x0bf0f0fb,
+    0xecadad41, 0x67d4d4b3, 0xfda2a25f, 0xeaafaf45,
+    0xbf9c9c23, 0xf7a4a453, 0x967272e4, 0x5bc0c09b,
+    0xc2b7b775, 0x1cfdfde1, 0xae93933d, 0x6a26264c,
+    0x5a36366c, 0x413f3f7e, 0x02f7f7f5, 0x4fcccc83,
+    0x5c343468, 0xf4a5a551, 0x34e5e5d1, 0x08f1f1f9,
+    0x937171e2, 0x73d8d8ab, 0x53313162, 0x3f15152a,
+    0x0c040408, 0x52c7c795, 0x65232346, 0x5ec3c39d,
+    0x28181830, 0xa1969637, 0x0f05050a, 0xb59a9a2f,
+    0x0907070e, 0x36121224, 0x9b80801b, 0x3de2e2df,
+    0x26ebebcd, 0x6927274e, 0xcdb2b27f, 0x9f7575ea,
+    0x1b090912, 0x9e83831d, 0x742c2c58, 0x2e1a1a34,
+    0x2d1b1b36, 0xb26e6edc, 0xee5a5ab4, 0xfba0a05b,
+    0xf65252a4, 0x4d3b3b76, 0x61d6d6b7, 0xceb3b37d,
+    0x7b292952, 0x3ee3e3dd, 0x712f2f5e, 0x97848413,
+    0xf55353a6, 0x68d1d1b9, 0x00000000, 0x2cededc1,
+    0x60202040, 0x1ffcfce3, 0xc8b1b179, 0xed5b5bb6,
+    0xbe6a6ad4, 0x46cbcb8d, 0xd9bebe67, 0x4b393972,
+    0xde4a4a94, 0xd44c4c98, 0xe85858b0, 0x4acfcf85,
+    0x6bd0d0bb, 0x2aefefc5, 0xe5aaaa4f, 0x16fbfbed,
+    0xc5434386, 0xd74d4d9a, 0x55333366, 0x94858511,
+    0xcf45458a, 0x10f9f9e9, 0x06020204, 0x817f7ffe,
+    0xf05050a0, 0x443c3c78, 0xba9f9f25, 0xe3a8a84b,
+    0xf35151a2, 0xfea3a35d, 0xc0404080, 0x8a8f8f05,
+    0xad92923f, 0xbc9d9d21, 0x48383870, 0x04f5f5f1,
+    0xdfbcbc63, 0xc1b6b677, 0x75dadaaf, 0x63212142,
+    0x30101020, 0x1affffe5, 0x0ef3f3fd, 0x6dd2d2bf,
+    0x4ccdcd81, 0x140c0c18, 0x35131326, 0x2fececc3,
+    0xe15f5fbe, 0xa2979735, 0xcc444488, 0x3917172e,
+    0x57c4c493, 0xf2a7a755, 0x827e7efc, 0x473d3d7a,
+    0xac6464c8, 0xe75d5dba, 0x2b191932, 0x957373e6,
+    0xa06060c0, 0x98818119, 0xd14f4f9e, 0x7fdcdca3,
+    0x66222244, 0x7e2a2a54, 0xab90903b, 0x8388880b,
+    0xca46468c, 0x29eeeec7, 0xd3b8b86b, 0x3c141428,
+    0x79dedea7, 0xe25e5ebc, 0x1d0b0b16, 0x76dbdbad,
+    0x3be0e0db, 0x56323264, 0x4e3a3a74, 0x1e0a0a14,
+    0xdb494992, 0x0a06060c, 0x6c242448, 0xe45c5cb8,
+    0x5dc2c29f, 0x6ed3d3bd, 0xefacac43, 0xa66262c4,
+    0xa8919139, 0xa4959531, 0x37e4e4d3, 0x8b7979f2,
+    0x32e7e7d5, 0x43c8c88b, 0x5937376e, 0xb76d6dda,
+    0x8c8d8d01, 0x64d5d5b1, 0xd24e4e9c, 0xe0a9a949,
+    0xb46c6cd8, 0xfa5656ac, 0x07f4f4f3, 0x25eaeacf,
+    0xaf6565ca, 0x8e7a7af4, 0xe9aeae47, 0x18080810,
+    0xd5baba6f, 0x887878f0, 0x6f25254a, 0x722e2e5c,
+    0x241c1c38, 0xf1a6a657, 0xc7b4b473, 0x51c6c697,
+    0x23e8e8cb, 0x7cdddda1, 0x9c7474e8, 0x211f1f3e,
+    0xdd4b4b96, 0xdcbdbd61, 0x868b8b0d, 0x858a8a0f,
+    0x907070e0, 0x423e3e7c, 0xc4b5b571, 0xaa6666cc,
+    0xd8484890, 0x05030306, 0x01f6f6f7, 0x120e0e1c,
+    0xa36161c2, 0x5f35356a, 0xf95757ae, 0xd0b9b969,
+    0x91868617, 0x58c1c199, 0x271d1d3a, 0xb99e9e27,
+    0x38e1e1d9, 0x13f8f8eb, 0xb398982b, 0x33111122,
+    0xbb6969d2, 0x70d9d9a9, 0x898e8e07, 0xa7949433,
+    0xb69b9b2d, 0x221e1e3c, 0x92878715, 0x20e9e9c9,
+    0x49cece87, 0xff5555aa, 0x78282850, 0x7adfdfa5,
+    0x8f8c8c03, 0xf8a1a159, 0x80898909, 0x170d0d1a,
+    0xdabfbf65, 0x31e6e6d7, 0xc6424284, 0xb86868d0,
+    0xc3414182, 0xb0999929, 0x772d2d5a, 0x110f0f1e,
+    0xcbb0b07b, 0xfc5454a8, 0xd6bbbb6d, 0x3a16162c
+  };
+
+static const uint32_t rcon[30] =
+  {
+    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36, 0x6c,
+    0xd8, 0xab, 0x4d, 0x9a, 0x2f, 0x5e, 0xbc, 0x63, 0xc6, 0x97, 0x35,
+    0x6a, 0xd4, 0xb3, 0x7d, 0xfa, 0xef, 0xc5, 0x91
+  };
diff --git a/elf/tst-dl-arc4random-static.c b/elf/tst-dl-arc4random-static.c
new file mode 100644
index 0000000000..d66b342a06
--- /dev/null
+++ b/elf/tst-dl-arc4random-static.c
@@ -0,0 +1,19 @@ 
+/* Test for low-level AES-128 routines for the arc4random PRNG, static version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "tst-dl-arc4random.c"
diff --git a/elf/tst-dl-arc4random.c b/elf/tst-dl-arc4random.c
new file mode 100644
index 0000000000..fbdb2a1b0f
--- /dev/null
+++ b/elf/tst-dl-arc4random.c
@@ -0,0 +1,48 @@ 
+/* Test for low-level AES-128 routines for the arc4random PRNG.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <arc4random.h>
+#include <string.h>
+#include <support/check.h>
+
+static int
+do_test (void)
+{
+  unsigned char at_random[16]
+    = { 1, 2, 3, 4, 5, 6, 7, 8,
+        131, 132, 133, 134, 135, 136, 137, 138};
+
+  struct arc4random_global_state state;
+  memset (&state, 0, sizeof (state));
+  _dl_arc4random_schedule (&state, at_random);
+
+  struct arc4random_personalization personalization;
+  _Static_assert (sizeof (personalization) == 16, "sizeof (personalization)");
+  memcpy (&personalization,
+          "\xf7\xf6\xf5\xf4\xf3\xf2\xf1\xf0"
+          "\x00\x01\x02\x03\x04\x05\x06\x07",
+          sizeof (personalization));
+  struct arc4random_block result;
+  _dl_arc4random_block (&state, personalization, &result);
+
+  TEST_COMPARE_BLOB (&result, sizeof (result),
+                     "\xa8\x0b\xa2\x8a\xdd\x9ew\\\000aK\xdc/\xa7\xd5\x16", 16);
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/include/arc4random.h b/include/arc4random.h
new file mode 100644
index 0000000000..d7f48f4e4f
--- /dev/null
+++ b/include/arc4random.h
@@ -0,0 +1,99 @@ 
+/* Internal definitions for the arc4random PRNG.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The random number generator uses AES-128 to encrypt a pair of
+   64-bit numbers, a personalization number and a block count.
+
+   Without reliable fork detection, the global state (in struct
+   arc4random_global_state) is mapped with the MAP_SHARED flag, so
+   that it is shared across fork.  The global state contains the AES
+   key schedule (derived from the 16-byte secret key used for AES-128
+   encryption), and the next block number to use.
+
+   The per-thread state in struct arc4random_thread_state contains
+   cached but not yet used results of an AES operation, and information
+   required to validate that the cache is still valid.  Due to lack of
+   fork detection, any access to the cache needs to advance the global
+   block counter (and check that no other thread or process has
+   advanced it).
+
+   Once we have reliable fork detection, a truly thread-local cache is
+   possible.  */
+
+#ifndef _ARC4RANDOM_H
+#define _ARC4RANDOM_H
+
+#include <atomic.h>
+#include <stdint.h>
+
+/* AES-128 is specified to use 10 rounds.  */
+enum { arc4random_aes_rounds = 10 };
+
+/* AES-128 produces output blocks of 16 bytes.  */
+enum { arc4random_block_size = 16 };
+
+struct arc4random_block
+{
+  uint32_t data[4];
+};
+
+/* The data which is encrypted using AES-128.  The personalization
+   value must come from struct arc4random_global_state.  */
+struct arc4random_personalization
+{
+  /* Unique number assigned to this thread.  Note that the ID is *not*
+     necessarily unique across threads in different processes.
+     Therefore, it is still necessary to ensure divergence of the
+     random bit streams by other means.  ID 0 is reserved for the
+     dynamic linker itself.  */
+  uint64_t thread_id;
+
+  /* The block number within a single thread.  This is advanced each
+     time a new block of randomness is obtained.  */
+  uint64_t block_number;
+};
+_Static_assert (sizeof (struct arc4random_personalization)
+                == arc4random_block_size,
+                "personalization size matches AES-128 block size");
+
+/* The global random number generator state.  This is located on a
+   page allocated with MAP_SHARED, so that it is inherited across
+   fork.  rtld_global contains a pointer to such an object.  */
+struct arc4random_global_state
+{
+  /* AES key schedule.  Show the alignment to the compiler.  (In
+     practice, this struct is allocated at a page-aligned
+     address.)  */
+  uint32_t key_schedule[arc4random_aes_rounds + 1][4]
+    __attribute__ ((aligned (128)));
+};
+
+/* Initializes the AES-128 key schedule.  KEY must point to 16 bytes
+   of key material.  Exported from ld.so for use with reseeding.  */
+void _dl_arc4random_schedule (struct arc4random_global_state *,
+                              const unsigned char *key);
+rtld_hidden_proto (_dl_arc4random_schedule)
+
+/* Computes one block of random data.  This function is exported from
+   ld.so for use within libc.so.  */
+void _dl_arc4random_block (const struct arc4random_global_state *,
+                           const struct arc4random_personalization,
+                           struct arc4random_block *output);
+rtld_hidden_proto (_dl_arc4random_block)
+
+#endif /* _ARC4RANDOM_H */
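
The header comment in include/arc4random.h describes the generator as
AES-128 encrypting a pair of 64-bit numbers (a thread ID and a block
number).  As a rough illustration only, and not code from the patch, the
sketch below shows how such a pair fills exactly one 16-byte AES input
block, so that distinct (thread ID, block number) pairs always yield
distinct inputs.  The names sketch_personalization and sketch_make_input
are hypothetical, and copying the struct in host byte order is an
assumption; the patch itself passes the struct by value to
_dl_arc4random_block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct arc4random_personalization from the
   patch: two 64-bit fields, 16 bytes in total.  */
struct sketch_personalization
{
  uint64_t thread_id;     /* 0 is reserved for the dynamic linker.  */
  uint64_t block_number;  /* Advanced once per block of output.  */
};

/* Same size check as in the patch: the input must match the AES-128
   block size of 16 bytes.  */
_Static_assert (sizeof (struct sketch_personalization) == 16,
                "personalization size matches AES-128 block size");

/* Build the 16-byte block that would be handed to the cipher.  The
   bytes are copied in host byte order (an assumption of this sketch).  */
static void
sketch_make_input (uint64_t thread_id, uint64_t block_number,
                   unsigned char out[16])
{
  struct sketch_personalization p;
  p.thread_id = thread_id;
  p.block_number = block_number;
  memcpy (out, &p, sizeof (p));
}

/* Return nonzero if two 16-byte cipher inputs are identical.  Under a
   fixed key, AES is a permutation, so identical inputs are the only
   way to obtain identical output blocks.  */
static int
sketch_inputs_equal (const unsigned char a[16], const unsigned char b[16])
{
  return memcmp (a, b, 16) == 0;
}
```

Because the block number advances on every request and the thread ID
separates threads within one process, no input repeats under a given
key schedule as long as neither field wraps around.  As the header
comment notes, thread IDs are not unique across processes, so divergence
after fork still has to be ensured by other means.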