[v6,03/10] benchtests: Add arc4random benchtest

Message ID 20220518191424.3630729-4-adhemerval.zanella@linaro.org
State Superseded
Series Add arc4random support

Checks

Context                 Check     Description
dj/TryBot-apply_patch   success   Patch applied to master at the time it was sent

Commit Message

Adhemerval Zanella May 18, 2022, 7:14 p.m. UTC
  The benchmark reports both throughput (total bytes obtained in the
test duration) and latency for arc4random and for arc4random_buf with
different buffer sizes.

Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.
---
 benchtests/Makefile           |   5 +-
 benchtests/bench-arc4random.c | 224 ++++++++++++++++++++++++++++++++++
 stdlib/arc4random.c           |  14 +--
 3 files changed, 235 insertions(+), 8 deletions(-)
 create mode 100644 benchtests/bench-arc4random.c
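
The two metrics named in the commit message can be sketched as follows.
This is only an illustration under stated assumptions, not the benchmark
from the patch: sample_random is a hypothetical stand-in for the function
under test (it is not arc4random), and the names measure_throughput and
measure_latency are invented here. Throughput runs until a SIGALRM timer
sets a flag and divides the bytes produced by the elapsed time; latency
times a fixed number of calls and divides by the call count.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical stand-in for the function under test (an xorshift step).
   It is NOT arc4random and is not cryptographically secure.  */
static uint32_t
sample_random (void)
{
  static uint32_t state = 2463534242u;
  state ^= state << 13;
  state ^= state >> 17;
  state ^= state << 5;
  return state;
}

/* Volatile sink so the calls are not optimized away.  */
static volatile uint32_t sink;
static volatile sig_atomic_t timer_finished;

static void
on_alarm (int sig)
{
  (void) sig;
  timer_finished = 1;
}

/* Throughput: bytes produced per second over roughly MSEC milliseconds.
   Returns 0.0 if MSEC is not positive (a zero timer would never fire).  */
static double
measure_throughput (long msec)
{
  if (msec <= 0)
    return 0.0;

  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_alarm;
  sigemptyset (&sa.sa_mask);
  sigaction (SIGALRM, &sa, NULL);

  struct itimerval it;
  memset (&it, 0, sizeof it);
  it.it_value.tv_sec = msec / 1000;
  it.it_value.tv_usec = (msec % 1000) * 1000;
  timer_finished = 0;
  setitimer (ITIMER_REAL, &it, NULL);

  struct timespec start, end;
  uint64_t calls = 0;
  clock_gettime (CLOCK_MONOTONIC, &start);
  do
    {
      sink = sample_random ();
      calls++;
    }
  while (!timer_finished);
  clock_gettime (CLOCK_MONOTONIC, &end);

  double seconds = (double) (end.tv_sec - start.tv_sec)
    + (double) (end.tv_nsec - start.tv_nsec) / 1e9;
  return (double) calls * sizeof (uint32_t) / seconds;
}

/* Latency: average nanoseconds per call over ITERS calls.  */
static double
measure_latency (size_t iters)
{
  struct timespec start, end;
  clock_gettime (CLOCK_MONOTONIC, &start);
  for (size_t i = 0; i < iters; i++)
    sink = sample_random ();
  clock_gettime (CLOCK_MONOTONIC, &end);

  double ns = (double) (end.tv_sec - start.tv_sec) * 1e9
    + (double) (end.tv_nsec - start.tv_nsec);
  return ns / (double) iters;
}
```

Checking the flag instead of reading the clock on every iteration keeps
the timing overhead out of the measured loop, which appears to be the
reason the patch uses a timer signal as well.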
  

Comments

Florian Weimer June 28, 2022, 12:01 p.m. UTC | #1
* Adhemerval Zanella via Libc-alpha:

> diff --git a/benchtests/bench-arc4random.c b/benchtests/bench-arc4random.c
> new file mode 100644
> index 0000000000..626f2ba48c
> --- /dev/null
> +++ b/benchtests/bench-arc4random.c
> @@ -0,0 +1,224 @@

> +/* Prevent compiler to optimize away call.  */
> +#define DO_NOT_OPTIMIZE(__value)		\
> +  ({						\
> +    __typeof (__value) __v = (__value);		\
> +    asm volatile("" : : "r,m" (__v) : "memory");\
> +  })

Missing space after “volatile”.

Maybe this should be a generic building block for benchtests?

> +static void
> +run_bench (json_ctx_t *json_ctx, const char *name,
> +	   char *const*fnames, size_t fnameslen,
> +	   void (*bench)(json_ctx_t *ctx))

Missing space between “)(”.


> diff --git a/stdlib/arc4random.c b/stdlib/arc4random.c
> index f6553dfd7d..74c917b9b9 100644
> --- a/stdlib/arc4random.c
> +++ b/stdlib/arc4random.c
> @@ -25,9 +25,9 @@
>  #include <sys/param.h>
>  #include <sys/random.h>
>  
> -/* Besides the cipher state 'ctx', it keeps two counters: 'have' is the
> -   current valid bytes not yet consumed in 'buf', while 'count' is the maximum
> -   number of bytes until a reseed.
> +/* arc4random keeps two counters: 'have' is the current valid bytes not yet
> +   consumed in 'buf' while 'count' is the maximum number of bytes until a
> +   reseed.
>  
>     Both the initial seed and reseed try to obtain entropy from the kernel
>     and abort the process if none could be obtained.
> @@ -81,10 +81,10 @@ arc4random_getrandom_failure (void)
>    __libc_fatal ("Fatal glibc error: Cannot get entropy for arc4random\n");
>  }
>  
> -/* Fork detection is done by checking if MADV_WIPEONFORK supported.  If not
> -   the fork callback will reset the state on the fork call.  It does not
> -   handle direct clone calls, nor vfork or _Fork (arc4random is not
> -   async-signal-safe due the internal lock usage).  */
> +/* If the kernel supports MADV_WIPEONFORK it is used to provide fork
> +   detection. Otherwise, the state is reset with an atfork handler. The
> +   latter does not handle direct clone calls, nor vfork or _Fork (arc4random
> +   is not async-signal-safe due to the internal lock usage).  */
>  static void
>  arc4random_init (uint8_t *buf)
>  {

I think this belongs in some other patch.

Thanks,
Florian
  
Adhemerval Zanella June 28, 2022, 5:06 p.m. UTC | #2
> On 28 Jun 2022, at 09:01, Florian Weimer <fweimer@redhat.com> wrote:
> 
> * Adhemerval Zanella via Libc-alpha:
> 
>> diff --git a/benchtests/bench-arc4random.c b/benchtests/bench-arc4random.c
>> new file mode 100644
>> index 0000000000..626f2ba48c
>> --- /dev/null
>> +++ b/benchtests/bench-arc4random.c
>> @@ -0,0 +1,224 @@
> 
>> +/* Prevent compiler to optimize away call.  */
>> +#define DO_NOT_OPTIMIZE(__value)		\
>> +  ({						\
>> +    __typeof (__value) __v = (__value);		\
>> +    asm volatile("" : : "r,m" (__v) : "memory");\
>> +  })
> 
> Missing space after “volatile”.

Ack.

> 
> Maybe this should be a generic building block for benchtests?

Indeed, I moved it to benchtests/bench-util.h.
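
A minimal sketch of what such a shared helper could look like. The
actual contents of benchtests/bench-util.h are not shown in this thread,
so the guard name BENCH_UTIL_SKETCH_H and the demo functions next_value
and run_loop are assumptions made for illustration. The macro relies on
GNU C extensions (statement expressions, __typeof, extended asm).

```c
#ifndef BENCH_UTIL_SKETCH_H
#define BENCH_UTIL_SKETCH_H

/* Force VALUE to be evaluated: the empty asm statement takes the value
   as an input and declares a memory clobber, so the compiler cannot
   drop the expression that produced it.  */
#define DO_NOT_OPTIMIZE(value)				\
  ({							\
    __typeof (value) __v = (value);			\
    __asm__ volatile ("" : : "r,m" (__v) : "memory");	\
  })

#endif /* BENCH_UTIL_SKETCH_H */

/* Hypothetical function under test, used only for this illustration.  */
static unsigned int calls;

static unsigned int
next_value (void)
{
  return ++calls * 2654435761u;
}

/* Call next_value ITERS times and report how many calls happened.  */
static unsigned int
run_loop (unsigned int iters)
{
  for (unsigned int i = 0; i < iters; i++)
    DO_NOT_OPTIMIZE (next_value ());
  return calls;
}
```

A benchmark would include the header and wrap each measured call, as
run_loop does here.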

> 
>> +static void
>> +run_bench (json_ctx_t *json_ctx, const char *name,
>> +	   char *const*fnames, size_t fnameslen,
>> +	   void (*bench)(json_ctx_t *ctx))
> 
> Missing space between “)(”.

Ack.

> 
> 
>> diff --git a/stdlib/arc4random.c b/stdlib/arc4random.c
>> index f6553dfd7d..74c917b9b9 100644
>> --- a/stdlib/arc4random.c
>> +++ b/stdlib/arc4random.c
>> @@ -25,9 +25,9 @@
>> #include <sys/param.h>
>> #include <sys/random.h>
>> 
>> -/* Besides the cipher state 'ctx', it keeps two counters: 'have' is the
>> -   current valid bytes not yet consumed in 'buf', while 'count' is the maximum
>> -   number of bytes until a reseed.
>> +/* arc4random keeps two counters: 'have' is the current valid bytes not yet
>> +   consumed in 'buf' while 'count' is the maximum number of bytes until a
>> +   reseed.
>> 
>>    Both the initial seed and reseed try to obtain entropy from the kernel
>>    and abort the process if none could be obtained.
>> @@ -81,10 +81,10 @@ arc4random_getrandom_failure (void)
>>   __libc_fatal ("Fatal glibc error: Cannot get entropy for arc4random\n");
>> }
>> 
>> -/* Fork detection is done by checking if MADV_WIPEONFORK supported.  If not
>> -   the fork callback will reset the state on the fork call.  It does not
>> -   handle direct clone calls, nor vfork or _Fork (arc4random is not
>> -   async-signal-safe due the internal lock usage).  */
>> +/* If the kernel supports MADV_WIPEONFORK it is used to provide fork
>> +   detection. Otherwise, the state is reset with an atfork handler. The
>> +   latter does not handle direct clone calls, nor vfork or _Fork (arc4random
>> +   is not async-signal-safe due to the internal lock usage).  */
>> static void
>> arc4random_init (uint8_t *buf)
>> {
> 
> I think this belongs in some other patch.

Indeed, I will fix it.

> 
> Thanks,
> Florian
>
  
Noah Goldstein June 28, 2022, 5:13 p.m. UTC | #3
On Wed, May 18, 2022 at 12:15 PM Adhemerval Zanella via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> It shows both throughput (total bytes obtained in the test duration)
> and latency for both arc4random and arc4random_buf with different
> sizes.
>
> Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.
> ---
>  benchtests/Makefile           |   5 +-
>  benchtests/bench-arc4random.c | 224 ++++++++++++++++++++++++++++++++++
>  stdlib/arc4random.c           |  14 +--
>  3 files changed, 235 insertions(+), 8 deletions(-)
>  create mode 100644 benchtests/bench-arc4random.c
>
> diff --git a/benchtests/Makefile b/benchtests/Makefile
> index de9de5cf58..76583d45a3 100644
> --- a/benchtests/Makefile
> +++ b/benchtests/Makefile
> @@ -227,7 +227,10 @@ LOCALES := \
>  include ../gen-locales.mk
>  endif
>
> -stdlib-benchset := strtod
> +stdlib-benchset := \
> +  arc4random \
> +  strtod \
> +  # stdlib-benchset
>
>  stdio-common-benchset := sprintf
>
> diff --git a/benchtests/bench-arc4random.c b/benchtests/bench-arc4random.c
> new file mode 100644
> index 0000000000..626f2ba48c
> --- /dev/null
> +++ b/benchtests/bench-arc4random.c
> @@ -0,0 +1,224 @@
> +/* arc4random benchmarks.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include "bench-timing.h"
> +#include "json-lib.h"
> +#include <array_length.h>
> +#include <intprops.h>
> +#include <signal.h>
> +#include <stdbool.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <support/support.h>
> +#include <support/timespec.h>
> +#include <support/xthread.h>
> +
> +/* Prevent compiler to optimize away call.  */
> +#define DO_NOT_OPTIMIZE(__value)               \
> +  ({                                           \
> +    __typeof (__value) __v = (__value);                \
> +    asm volatile("" : : "r,m" (__v) : "memory");\
> +  })

This might be useful to have in the support header.
bench-hash-func.c has essentially the same code.

> +
> +static volatile sig_atomic_t timer_finished;
> +
> +static void timer_callback (int unused)
> +{
> +  timer_finished = 1;
> +}
> +
> +static timer_t timer;
> +
> +/* Run for approximately DURATION seconds, and it does not matter who
> +   receive the signal (so not need to mask it on main thread).  */
> +static void
> +timer_start (void)
> +{
> +  timer_finished = 0;
> +  timer = support_create_timer (DURATION, 0, false, timer_callback);
> +}
> +static void
> +timer_stop (void)
> +{
> +  support_delete_timer (timer);
> +}
> +
> +static const uint32_t sizes[] = { 0, 16, 32, 48, 64, 80, 96, 112, 128 };
> +
> +static double
> +bench_throughput (void)
> +{
> +  uint64_t n = 0;
> +
> +  struct timespec start, end;
> +  clock_gettime (CLOCK_MONOTONIC, &start);
> +  while (1)
> +    {
> +      DO_NOT_OPTIMIZE (arc4random ());
> +      n++;
> +
> +      if (timer_finished == 1)
> +       break;
> +    }
> +  clock_gettime (CLOCK_MONOTONIC, &end);
> +  struct timespec diff = timespec_sub (end, start);
> +
> +  double total = (double) n * sizeof (uint32_t);
> +  double duration = (double) diff.tv_sec
> +    + (double) diff.tv_nsec / TIMESPEC_HZ;
> +
> +  return total / duration;
> +}
> +
> +static double
> +bench_latency (void)
> +{
> +  timing_t start, stop, cur;
> +  const size_t iters = 1024;
> +
> +  TIMING_NOW (start);
> +  for (size_t i = 0; i < iters; i++)
> +    DO_NOT_OPTIMIZE (arc4random ());
> +  TIMING_NOW (stop);
> +
> +  TIMING_DIFF (cur, start, stop);
> +
> +  return (double) (cur) / (double) iters;
> +}
> +
> +static double
> +bench_buf_throughput (size_t len)
> +{
> +  uint8_t buf[len];
> +  uint64_t n = 0;
> +
> +  struct timespec start, end;
> +  clock_gettime (CLOCK_MONOTONIC, &start);
> +  while (1)
> +    {
> +      arc4random_buf (buf, len);
> +      n++;
> +
> +      if (timer_finished == 1)
> +       break;
> +    }
> +  clock_gettime (CLOCK_MONOTONIC, &end);
> +  struct timespec diff = timespec_sub (end, start);
> +
> +  double total = (double) n * len;
> +  double duration = (double) diff.tv_sec
> +    + (double) diff.tv_nsec / TIMESPEC_HZ;
> +
> +  return total / duration;
> +}
> +
> +static double
> +bench_buf_latency (size_t len)
> +{
> +  timing_t start, stop, cur;
> +  const size_t iters = 1024;
> +
> +  uint8_t buf[len];
> +
> +  TIMING_NOW (start);
> +  for (size_t i = 0; i < iters; i++)
> +    arc4random_buf (buf, len);
> +  TIMING_NOW (stop);
> +
> +  TIMING_DIFF (cur, start, stop);
> +
> +  return (double) (cur) / (double) iters;
> +}
> +
> +static void
> +bench_singlethread (json_ctx_t *json_ctx)
> +{
> +  json_element_object_begin (json_ctx);
> +
> +  json_array_begin (json_ctx, "throughput");
> +  for (int i = 0; i < array_length (sizes); i++)
> +    {
> +      timer_start ();
> +      double r = sizes[i] == 0
> +       ? bench_throughput () : bench_buf_throughput (sizes[i]);
> +      timer_stop ();
> +
> +      json_element_double (json_ctx, r);
> +    }
> +  json_array_end (json_ctx);
> +
> +  json_array_begin (json_ctx, "latency");
> +  for (int i = 0; i < array_length (sizes); i++)
> +    {
> +      timer_start ();
> +      double r = sizes[i] == 0
> +       ? bench_latency () : bench_buf_latency (sizes[i]);
> +      timer_stop ();
> +
> +      json_element_double (json_ctx, r);
> +    }
> +  json_array_end (json_ctx);
> +
> +  json_element_object_end (json_ctx);
> +}
> +
> +static void
> +run_bench (json_ctx_t *json_ctx, const char *name,
> +          char *const*fnames, size_t fnameslen,
> +          void (*bench)(json_ctx_t *ctx))
> +{
> +  json_attr_object_begin (json_ctx, name);
> +  json_array_begin (json_ctx, "functions");
> +  for (int i = 0; i < fnameslen; i++)
> +    json_element_string (json_ctx, fnames[i]);
> +  json_array_end (json_ctx);
> +
> +  json_array_begin (json_ctx, "results");
> +  bench (json_ctx);
> +  json_array_end (json_ctx);
> +  json_attr_object_end (json_ctx);
> +}
> +
> +static int
> +do_test (void)
> +{
> +  char *fnames[array_length (sizes)];
> +  for (int i = 0; i < array_length (sizes); i++)
> +    if (sizes[i] == 0)
> +      fnames[i] = xasprintf ("arc4random");
> +    else
> +      fnames[i] = xasprintf ("arc4random_buf(%u)", sizes[i]);
> +
> +  json_ctx_t json_ctx;
> +  json_init (&json_ctx, 0, stdout);
> +
> +  json_document_begin (&json_ctx);
> +  json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
> +
> +  run_bench (&json_ctx, "single-thread", fnames, array_length (fnames),
> +            bench_singlethread);
> +
> +  json_document_end (&json_ctx);
> +
> +  for (int i = 0; i < array_length (sizes); i++)
> +    free (fnames[i]);
> +
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
> diff --git a/stdlib/arc4random.c b/stdlib/arc4random.c
> index f6553dfd7d..74c917b9b9 100644
> --- a/stdlib/arc4random.c
> +++ b/stdlib/arc4random.c
> @@ -25,9 +25,9 @@
>  #include <sys/param.h>
>  #include <sys/random.h>
>
> -/* Besides the cipher state 'ctx', it keeps two counters: 'have' is the
> -   current valid bytes not yet consumed in 'buf', while 'count' is the maximum
> -   number of bytes until a reseed.
> +/* arc4random keeps two counters: 'have' is the current valid bytes not yet
> +   consumed in 'buf' while 'count' is the maximum number of bytes until a
> +   reseed.
>
>     Both the initial seed and reseed try to obtain entropy from the kernel
>     and abort the process if none could be obtained.
> @@ -81,10 +81,10 @@ arc4random_getrandom_failure (void)
>    __libc_fatal ("Fatal glibc error: Cannot get entropy for arc4random\n");
>  }
>
> -/* Fork detection is done by checking if MADV_WIPEONFORK supported.  If not
> -   the fork callback will reset the state on the fork call.  It does not
> -   handle direct clone calls, nor vfork or _Fork (arc4random is not
> -   async-signal-safe due the internal lock usage).  */
> +/* If the kernel supports MADV_WIPEONFORK it is used to provide fork
> +   detection. Otherwise, the state is reset with an atfork handler. The
> +   latter does not handle direct clone calls, nor vfork or _Fork (arc4random
> +   is not async-signal-safe due to the internal lock usage).  */
>  static void
>  arc4random_init (uint8_t *buf)
>  {
> --
> 2.34.1
>
  
Noah Goldstein June 28, 2022, 5:14 p.m. UTC | #4
On Tue, Jun 28, 2022 at 10:13 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Wed, May 18, 2022 at 12:15 PM Adhemerval Zanella via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > It shows both throughput (total bytes obtained in the test duration)
> > and latency for both arc4random and arc4random_buf with different
> > sizes.
> >
> > Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.
> > ---
> >  benchtests/Makefile           |   5 +-
> >  benchtests/bench-arc4random.c | 224 ++++++++++++++++++++++++++++++++++
> >  stdlib/arc4random.c           |  14 +--
> >  3 files changed, 235 insertions(+), 8 deletions(-)
> >  create mode 100644 benchtests/bench-arc4random.c
> >
> > diff --git a/benchtests/Makefile b/benchtests/Makefile
> > index de9de5cf58..76583d45a3 100644
> > --- a/benchtests/Makefile
> > +++ b/benchtests/Makefile
> > @@ -227,7 +227,10 @@ LOCALES := \
> >  include ../gen-locales.mk
> >  endif
> >
> > -stdlib-benchset := strtod
> > +stdlib-benchset := \
> > +  arc4random \
> > +  strtod \
> > +  # stdlib-benchset
> >
> >  stdio-common-benchset := sprintf
> >
> > diff --git a/benchtests/bench-arc4random.c b/benchtests/bench-arc4random.c
> > new file mode 100644
> > index 0000000000..626f2ba48c
> > --- /dev/null
> > +++ b/benchtests/bench-arc4random.c
> > @@ -0,0 +1,224 @@
> > +/* arc4random benchmarks.
> > +   Copyright (C) 2022 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include "bench-timing.h"
> > +#include "json-lib.h"
> > +#include <array_length.h>
> > +#include <intprops.h>
> > +#include <signal.h>
> > +#include <stdbool.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <support/support.h>
> > +#include <support/timespec.h>
> > +#include <support/xthread.h>
> > +
> > +/* Prevent compiler to optimize away call.  */
> > +#define DO_NOT_OPTIMIZE(__value)               \
> > +  ({                                           \
> > +    __typeof (__value) __v = (__value);                \
> > +    asm volatile("" : : "r,m" (__v) : "memory");\
> > +  })
>
> This might be useful to have in the support header.
> bench-hash-func.c has essentially the same code.

Ah, Florian beat me to it.

Can you update bench-hash-func.c when you put
this in a header?

The equivalent macro is defined in bench-hash-func.c and used in
bench-hash-func-kernel.h.
  
Adhemerval Zanella June 28, 2022, 5:22 p.m. UTC | #5
> On 28 Jun 2022, at 14:14, Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> 
> On Tue, Jun 28, 2022 at 10:13 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:

> Ah, Florian beat me to it.
> 
> Can you update bench-hash-func.c when you put
> this in a header?
> 
> Defined in bench-hash-func.c and used in bench-hash-func-kernel.h

Alright.
  

Patch

diff --git a/benchtests/Makefile b/benchtests/Makefile
index de9de5cf58..76583d45a3 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -227,7 +227,10 @@  LOCALES := \
 include ../gen-locales.mk
 endif
 
-stdlib-benchset := strtod
+stdlib-benchset := \
+  arc4random \
+  strtod \
+  # stdlib-benchset
 
 stdio-common-benchset := sprintf
 
diff --git a/benchtests/bench-arc4random.c b/benchtests/bench-arc4random.c
new file mode 100644
index 0000000000..626f2ba48c
--- /dev/null
+++ b/benchtests/bench-arc4random.c
@@ -0,0 +1,224 @@ 
+/* arc4random benchmarks.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include "bench-timing.h"
+#include "json-lib.h"
+#include <array_length.h>
+#include <intprops.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <support/support.h>
+#include <support/timespec.h>
+#include <support/xthread.h>
+
+/* Prevent compiler to optimize away call.  */
+#define DO_NOT_OPTIMIZE(__value)		\
+  ({						\
+    __typeof (__value) __v = (__value);		\
+    asm volatile("" : : "r,m" (__v) : "memory");\
+  })
+
+static volatile sig_atomic_t timer_finished;
+
+static void timer_callback (int unused)
+{
+  timer_finished = 1;
+}
+
+static timer_t timer;
+
+/* Run for approximately DURATION seconds, and it does not matter who
+   receive the signal (so not need to mask it on main thread).  */
+static void
+timer_start (void)
+{
+  timer_finished = 0;
+  timer = support_create_timer (DURATION, 0, false, timer_callback);
+}
+static void
+timer_stop (void)
+{
+  support_delete_timer (timer);
+}
+
+static const uint32_t sizes[] = { 0, 16, 32, 48, 64, 80, 96, 112, 128 };
+
+static double
+bench_throughput (void)
+{
+  uint64_t n = 0;
+
+  struct timespec start, end;
+  clock_gettime (CLOCK_MONOTONIC, &start);
+  while (1)
+    {
+      DO_NOT_OPTIMIZE (arc4random ());
+      n++;
+
+      if (timer_finished == 1)
+	break;
+    }
+  clock_gettime (CLOCK_MONOTONIC, &end);
+  struct timespec diff = timespec_sub (end, start);
+
+  double total = (double) n * sizeof (uint32_t);
+  double duration = (double) diff.tv_sec
+    + (double) diff.tv_nsec / TIMESPEC_HZ;
+
+  return total / duration;
+}
+
+static double
+bench_latency (void)
+{
+  timing_t start, stop, cur;
+  const size_t iters = 1024;
+
+  TIMING_NOW (start);
+  for (size_t i = 0; i < iters; i++)
+    DO_NOT_OPTIMIZE (arc4random ());
+  TIMING_NOW (stop);
+
+  TIMING_DIFF (cur, start, stop);
+
+  return (double) (cur) / (double) iters;
+}
+
+static double
+bench_buf_throughput (size_t len)
+{
+  uint8_t buf[len];
+  uint64_t n = 0;
+
+  struct timespec start, end;
+  clock_gettime (CLOCK_MONOTONIC, &start);
+  while (1)
+    {
+      arc4random_buf (buf, len);
+      n++;
+
+      if (timer_finished == 1)
+	break;
+    }
+  clock_gettime (CLOCK_MONOTONIC, &end);
+  struct timespec diff = timespec_sub (end, start);
+
+  double total = (double) n * len;
+  double duration = (double) diff.tv_sec
+    + (double) diff.tv_nsec / TIMESPEC_HZ;
+
+  return total / duration;
+}
+
+static double
+bench_buf_latency (size_t len)
+{
+  timing_t start, stop, cur;
+  const size_t iters = 1024;
+
+  uint8_t buf[len];
+
+  TIMING_NOW (start);
+  for (size_t i = 0; i < iters; i++)
+    arc4random_buf (buf, len);
+  TIMING_NOW (stop);
+
+  TIMING_DIFF (cur, start, stop);
+
+  return (double) (cur) / (double) iters;
+}
+
+static void
+bench_singlethread (json_ctx_t *json_ctx)
+{
+  json_element_object_begin (json_ctx);
+
+  json_array_begin (json_ctx, "throughput");
+  for (int i = 0; i < array_length (sizes); i++)
+    {
+      timer_start ();
+      double r = sizes[i] == 0
+	? bench_throughput () : bench_buf_throughput (sizes[i]);
+      timer_stop ();
+
+      json_element_double (json_ctx, r);
+    }
+  json_array_end (json_ctx);
+
+  json_array_begin (json_ctx, "latency");
+  for (int i = 0; i < array_length (sizes); i++)
+    {
+      timer_start ();
+      double r = sizes[i] == 0
+	? bench_latency () : bench_buf_latency (sizes[i]);
+      timer_stop ();
+
+      json_element_double (json_ctx, r);
+    }
+  json_array_end (json_ctx);
+
+  json_element_object_end (json_ctx);
+}
+
+static void
+run_bench (json_ctx_t *json_ctx, const char *name,
+	   char *const*fnames, size_t fnameslen,
+	   void (*bench)(json_ctx_t *ctx))
+{
+  json_attr_object_begin (json_ctx, name);
+  json_array_begin (json_ctx, "functions");
+  for (int i = 0; i < fnameslen; i++)
+    json_element_string (json_ctx, fnames[i]);
+  json_array_end (json_ctx);
+
+  json_array_begin (json_ctx, "results");
+  bench (json_ctx);
+  json_array_end (json_ctx);
+  json_attr_object_end (json_ctx);
+}
+
+static int
+do_test (void)
+{
+  char *fnames[array_length (sizes)];
+  for (int i = 0; i < array_length (sizes); i++)
+    if (sizes[i] == 0)
+      fnames[i] = xasprintf ("arc4random");
+    else
+      fnames[i] = xasprintf ("arc4random_buf(%u)", sizes[i]);
+
+  json_ctx_t json_ctx;
+  json_init (&json_ctx, 0, stdout);
+
+  json_document_begin (&json_ctx);
+  json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
+
+  run_bench (&json_ctx, "single-thread", fnames, array_length (fnames),
+	     bench_singlethread);
+
+  json_document_end (&json_ctx);
+
+  for (int i = 0; i < array_length (sizes); i++)
+    free (fnames[i]);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/stdlib/arc4random.c b/stdlib/arc4random.c
index f6553dfd7d..74c917b9b9 100644
--- a/stdlib/arc4random.c
+++ b/stdlib/arc4random.c
@@ -25,9 +25,9 @@ 
 #include <sys/param.h>
 #include <sys/random.h>
 
-/* Besides the cipher state 'ctx', it keeps two counters: 'have' is the
-   current valid bytes not yet consumed in 'buf', while 'count' is the maximum
-   number of bytes until a reseed.
+/* arc4random keeps two counters: 'have' is the current valid bytes not yet
+   consumed in 'buf' while 'count' is the maximum number of bytes until a
+   reseed.
 
    Both the initial seed and reseed try to obtain entropy from the kernel
    and abort the process if none could be obtained.
@@ -81,10 +81,10 @@  arc4random_getrandom_failure (void)
   __libc_fatal ("Fatal glibc error: Cannot get entropy for arc4random\n");
 }
 
-/* Fork detection is done by checking if MADV_WIPEONFORK supported.  If not
-   the fork callback will reset the state on the fork call.  It does not
-   handle direct clone calls, nor vfork or _Fork (arc4random is not
-   async-signal-safe due the internal lock usage).  */
+/* If the kernel supports MADV_WIPEONFORK it is used to provide fork
+   detection. Otherwise, the state is reset with an atfork handler. The
+   latter does not handle direct clone calls, nor vfork or _Fork (arc4random
+   is not async-signal-safe due to the internal lock usage).  */
 static void
 arc4random_init (uint8_t *buf)
 {
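
The fork-detection comment in the last hunk describes two strategies.
The sketch below illustrates the general idea on Linux and is not the
glibc implementation: struct rng_state, state_init, reset_in_child and
child_sees_cleared_state are hypothetical names. If madvise accepts
MADV_WIPEONFORK, the kernel zeroes the mapping in the child; otherwise a
pthread_atfork child handler clears it. As the comment notes, the
fallback does not cover direct clone calls, vfork or _Fork.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical state block; the real arc4random state differs.  */
struct rng_state
{
  unsigned char key[32];
  bool seeded;
};

static struct rng_state *state;
static bool wipe_on_fork;

/* Fallback path: clear the state in the child after fork.  */
static void
reset_in_child (void)
{
  memset (state, 0, sizeof *state);
}

/* Map the state and arrange for it to be cleared in forked children.
   Returns 0 on success, -1 on failure.  */
static int
state_init (void)
{
  void *p = mmap (NULL, sizeof *state, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  state = p;

#ifdef MADV_WIPEONFORK
  /* Fails with EINVAL on kernels without support.  */
  wipe_on_fork = madvise (state, sizeof *state, MADV_WIPEONFORK) == 0;
#endif
  if (!wipe_on_fork && pthread_atfork (NULL, NULL, reset_in_child) != 0)
    return -1;

  /* Pretend to seed the generator.  */
  memset (state->key, 0xA5, sizeof state->key);
  state->seeded = true;
  return 0;
}

/* Fork and report whether the child observes a cleared state:
   1 if cleared, 0 if not, -1 on error.  */
static int
child_sees_cleared_state (void)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (state->seeded ? 1 : 0);

  int status;
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status) == 0;
}
```

In either case the child sees seeded == false and would have to reseed
before producing output, while the parent's state is left untouched.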