[v2] Remove unnecessary locking when reading iconv configuration [BZ #22062]

Message ID 20181016142341.GA37762@aloka.lostca.se
State Committed
Headers

Commit Message

Arjun Shankar Oct. 16, 2018, 2:23 p.m. UTC
  In iconv/gconv_conf.c, __gconv_get_path unnecessarily obtains a lock when
populating the array pointed to by __gconv_path_elem. The locking is not
necessary because all calls to __gconv_read_conf (which in turn calls
__gconv_get_path) are serialized using __libc_once.

This patch:
- removes all locking in __gconv_get_path;
- replaces all explicitly serialized __gconv_read_conf calls with calls to
  __gconv_load_conf, a new wrapper that is serialized internally;
- adds a new test, iconv/tst-iconv-mt.c, to exercise iconv initialization,
  usage, and cleanup in a multi-threaded program;
- indents __gconv_get_path correctly, removing tab characters (which makes
  the patch look a little bigger than it really is).

After removing the unnecessary locking, it was confirmed that the test case
fails if the relevant __libc_once is removed. Additionally, four localedata
and iconvdata tests also fail. This gives confidence that the testsuite
sufficiently guards against some regressions relating to multi-threading
with iconv.

Tested on x86_64 and i686.

ChangeLog:

2018-10-16  Arjun Shankar  <arjun@redhat.com>

	[BZ #22062]
	* iconv/gconv_conf.c (__gconv_get_path): Remove locking and fix
	indentation.
	* (__gconv_read_conf): Mark function static.
	* (once): New static variable.
	* (__gconv_load_conf): New function.
	* iconv/gconv_int.h (__gconv_load_conf): Likewise.
	* iconv/gconv_db.c (once): Remove static variable.
	* (__gconv_compare_alias): Use __gconv_load_conf instead of
	__gconv_read_conf.
	* (__gconv_find_transform): Likewise.
	* iconv/tst-iconv-mt.c: New test.
	* iconv/Makefile: Add tst-iconv-mt.
---
 iconv/Makefile       |   5 +-
 iconv/gconv_conf.c   | 208 +++++++++++++++++++++++--------------------
 iconv/gconv_db.c     |   8 +-
 iconv/gconv_int.h    |   4 +-
 iconv/tst-iconv-mt.c | 142 +++++++++++++++++++++++++++++
 5 files changed, 259 insertions(+), 108 deletions(-)
 create mode 100644 iconv/tst-iconv-mt.c

This v2 of the patch addresses comments from the review of v1:
https://sourceware.org/ml/libc-alpha/2017-10/msg01195.html
  

Comments

Florian Weimer Oct. 17, 2018, 9:22 a.m. UTC | #1
* Arjun Shankar:

> 2018-10-16  Arjun Shankar  <arjun@redhat.com>
>
> 	[BZ #22062]
> 	* iconv/gconv_conf.c (__gconv_get_path): Remove locking and fix
> 	indentation.
> 	* (__gconv_read_conf): Mark function static.
> 	* (once): New static variable.
> 	* (__gconv_load_conf): New function.
> 	* iconv/gconv_int.h (__gconv_load_conf): Likewise.
> 	* iconv/gconv_db.c (once): Remove static variable.
> 	* (__gconv_compare_alias): Use __gconv_load_conf instead of
> 	__gconv_read_conf.
> 	* (__gconv_find_transform): Likewise.
> 	* iconv/tst-iconv-mt.c: New test.
> 	* iconv/Makefile: Add tst-iconv-mt.

Looks good to me.  The handling of the result variable in
__gconv_get_path is a bit awkward, but I think we should not clean this
up in this patch.  There are other issues in this function (the buffer
management, the lack of error reporting).

Please commit this to master.

Thanks,
Florian
  
Adhemerval Zanella Oct. 17, 2018, 12:19 p.m. UTC | #2
On 16/10/2018 11:23, Arjun Shankar wrote:
> -	}
> +      /* No user-defined path.  Make a modifiable copy of the
> +         default path.  */
> +      gconv_path = strdupa (default_gconv_path);
> +      gconv_path_len = sizeof (default_gconv_path);
> +      cwd = NULL;
> +      cwdlen = 0;
> +    }
> +  else
> +    {
> +      /* Append the default path to the user-defined path.  */
> +      size_t user_len = strlen (__gconv_path_envvar);
> +
> +      gconv_path_len = user_len + 1 + sizeof (default_gconv_path);
> +      gconv_path = alloca (gconv_path_len);
> +      __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
> +                                       user_len),
> +                            ":", 1),
> +                 default_gconv_path, sizeof (default_gconv_path));
> +      cwd = __getcwd (NULL, 0);
> +      cwdlen = __glibc_unlikely (cwd == NULL) ? 0 : strlen (cwd);

It this alloca really necessary? Can't we get rid of on along with fix (same
for other cases including strdupa)?
  
Carlos O'Donell Oct. 17, 2018, 1:25 p.m. UTC | #3
On 10/17/18 8:19 AM, Adhemerval Zanella wrote:
> 
> 
> On 16/10/2018 11:23, Arjun Shankar wrote:
>> -	}
>> +      /* No user-defined path.  Make a modifiable copy of the
>> +         default path.  */
>> +      gconv_path = strdupa (default_gconv_path);
>> +      gconv_path_len = sizeof (default_gconv_path);
>> +      cwd = NULL;
>> +      cwdlen = 0;
>> +    }
>> +  else
>> +    {
>> +      /* Append the default path to the user-defined path.  */
>> +      size_t user_len = strlen (__gconv_path_envvar);
>> +
>> +      gconv_path_len = user_len + 1 + sizeof (default_gconv_path);
>> +      gconv_path = alloca (gconv_path_len);
>> +      __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
>> +                                       user_len),
>> +                            ":", 1),
>> +                 default_gconv_path, sizeof (default_gconv_path));
>> +      cwd = __getcwd (NULL, 0);
>> +      cwdlen = __glibc_unlikely (cwd == NULL) ? 0 : strlen (cwd);
> 
> It this alloca really necessary? Can't we get rid of on along with fix (same
> for other cases including strdupa)?

No. Not in the same fix please. This is what Florian is talking about, there 
are buffer management issues here, but we need to handle them distinctly from
the P&C issue, which should be the *only* thing changing in this patch.
  
Adhemerval Zanella Oct. 17, 2018, 2:59 p.m. UTC | #4
On 17/10/2018 10:25, Carlos O'Donell wrote:
> On 10/17/18 8:19 AM, Adhemerval Zanella wrote:
>>
>>
>> On 16/10/2018 11:23, Arjun Shankar wrote:
>>> -	}
>>> +      /* No user-defined path.  Make a modifiable copy of the
>>> +         default path.  */
>>> +      gconv_path = strdupa (default_gconv_path);
>>> +      gconv_path_len = sizeof (default_gconv_path);
>>> +      cwd = NULL;
>>> +      cwdlen = 0;
>>> +    }
>>> +  else
>>> +    {
>>> +      /* Append the default path to the user-defined path.  */
>>> +      size_t user_len = strlen (__gconv_path_envvar);
>>> +
>>> +      gconv_path_len = user_len + 1 + sizeof (default_gconv_path);
>>> +      gconv_path = alloca (gconv_path_len);
>>> +      __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
>>> +                                       user_len),
>>> +                            ":", 1),
>>> +                 default_gconv_path, sizeof (default_gconv_path));
>>> +      cwd = __getcwd (NULL, 0);
>>> +      cwdlen = __glibc_unlikely (cwd == NULL) ? 0 : strlen (cwd);
>>
>> It this alloca really necessary? Can't we get rid of on along with fix (same
>> for other cases including strdupa)?
> 
> No. Not in the same fix please. This is what Florian is talking about, there 
> are buffer management issues here, but we need to handle them distinctly from
> the P&C issue, which should be the *only* thing changing in this patch.
> 

Ack, I agree it should be a different patch.
  

Patch

diff --git a/iconv/Makefile b/iconv/Makefile
index d71319b39e..79d03c3bbe 100644
--- a/iconv/Makefile
+++ b/iconv/Makefile
@@ -43,7 +43,8 @@  CFLAGS-charmap.c += -DCHARMAP_PATH='"$(i18ndir)/charmaps"' \
 CFLAGS-linereader.c += -DNO_TRANSLITERATION
 CFLAGS-simple-hash.c += -I../locale
 
-tests	= tst-iconv1 tst-iconv2 tst-iconv3 tst-iconv4 tst-iconv5 tst-iconv6
+tests	= tst-iconv1 tst-iconv2 tst-iconv3 tst-iconv4 tst-iconv5 tst-iconv6 \
+	  tst-iconv-mt
 
 others		= iconv_prog iconvconfig
 install-others-programs	= $(inst_bindir)/iconv
@@ -67,6 +68,8 @@  endif
 $(objpfx)gconv-modules: test-gconv-modules
 	cp $< $@
 
+$(objpfx)tst-iconv-mt: $(shared-thread-library)
+
 ifeq (yes,$(build-shared))
 tests += tst-gconv-init-failure
 modules-names += tst-gconv-init-failure-mod
diff --git a/iconv/gconv_conf.c b/iconv/gconv_conf.c
index ce9f10f3af..78010491e6 100644
--- a/iconv/gconv_conf.c
+++ b/iconv/gconv_conf.c
@@ -426,119 +426,115 @@  read_conf_file (const char *filename, const char *directory, size_t dir_len,
 }
 
 
-/* Determine the directories we are looking for data in.  */
+/* Determine the directories we are looking for data in.  This function should
+   only be called from __gconv_read_conf.  */
 static void
 __gconv_get_path (void)
 {
   struct path_elem *result;
-  __libc_lock_define_initialized (static, lock);
 
-  __libc_lock_lock (lock);
-
-  /* Make sure there wasn't a second thread doing it already.  */
-  result = (struct path_elem *) __gconv_path_elem;
-  if (result == NULL)
+  /* This function is only ever called when __gconv_path_elem is NULL.  */
+  result = __gconv_path_elem;
+  assert (result == NULL);
+
+  /* Determine the complete path first.  */
+  char *gconv_path;
+  size_t gconv_path_len;
+  char *elem;
+  char *oldp;
+  char *cp;
+  int nelems;
+  char *cwd;
+  size_t cwdlen;
+
+  if (__gconv_path_envvar == NULL)
     {
-      /* Determine the complete path first.  */
-      char *gconv_path;
-      size_t gconv_path_len;
-      char *elem;
-      char *oldp;
-      char *cp;
-      int nelems;
-      char *cwd;
-      size_t cwdlen;
-
-      if (__gconv_path_envvar == NULL)
-	{
-	  /* No user-defined path.  Make a modifiable copy of the
-	     default path.  */
-	  gconv_path = strdupa (default_gconv_path);
-	  gconv_path_len = sizeof (default_gconv_path);
-	  cwd = NULL;
-	  cwdlen = 0;
-	}
-      else
-	{
-	  /* Append the default path to the user-defined path.  */
-	  size_t user_len = strlen (__gconv_path_envvar);
-
-	  gconv_path_len = user_len + 1 + sizeof (default_gconv_path);
-	  gconv_path = alloca (gconv_path_len);
-	  __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
-					   user_len),
-				":", 1),
-		     default_gconv_path, sizeof (default_gconv_path));
-	  cwd = __getcwd (NULL, 0);
-	  cwdlen = __glibc_unlikely (cwd == NULL) ? 0 : strlen (cwd);
-	}
-      assert (default_gconv_path[0] == '/');
-
-      /* In a first pass we calculate the number of elements.  */
-      oldp = NULL;
-      cp = strchr (gconv_path, ':');
-      nelems = 1;
-      while (cp != NULL)
-	{
-	  if (cp != oldp + 1)
-	    ++nelems;
-	  oldp = cp;
-	  cp =  strchr (cp + 1, ':');
-	}
-
-      /* Allocate the memory for the result.  */
-      result = (struct path_elem *) malloc ((nelems + 1)
-					    * sizeof (struct path_elem)
-					    + gconv_path_len + nelems
-					    + (nelems - 1) * (cwdlen + 1));
-      if (result != NULL)
-	{
-	  char *strspace = (char *) &result[nelems + 1];
-	  int n = 0;
-
-	  /* Separate the individual parts.  */
-	  __gconv_max_path_elem_len = 0;
-	  elem = __strtok_r (gconv_path, ":", &gconv_path);
-	  assert (elem != NULL);
-	  do
-	    {
-	      result[n].name = strspace;
-	      if (elem[0] != '/')
-		{
-		  assert (cwd != NULL);
-		  strspace = __mempcpy (strspace, cwd, cwdlen);
-		  *strspace++ = '/';
-		}
-	      strspace = __stpcpy (strspace, elem);
-	      if (strspace[-1] != '/')
-		*strspace++ = '/';
-
-	      result[n].len = strspace - result[n].name;
-	      if (result[n].len > __gconv_max_path_elem_len)
-		__gconv_max_path_elem_len = result[n].len;
-
-	      *strspace++ = '\0';
-	      ++n;
-	    }
-	  while ((elem = __strtok_r (NULL, ":", &gconv_path)) != NULL);
-
-	  result[n].name = NULL;
-	  result[n].len = 0;
-	}
+      /* No user-defined path.  Make a modifiable copy of the
+         default path.  */
+      gconv_path = strdupa (default_gconv_path);
+      gconv_path_len = sizeof (default_gconv_path);
+      cwd = NULL;
+      cwdlen = 0;
+    }
+  else
+    {
+      /* Append the default path to the user-defined path.  */
+      size_t user_len = strlen (__gconv_path_envvar);
+
+      gconv_path_len = user_len + 1 + sizeof (default_gconv_path);
+      gconv_path = alloca (gconv_path_len);
+      __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
+                                       user_len),
+                            ":", 1),
+                 default_gconv_path, sizeof (default_gconv_path));
+      cwd = __getcwd (NULL, 0);
+      cwdlen = __glibc_unlikely (cwd == NULL) ? 0 : strlen (cwd);
+    }
+  assert (default_gconv_path[0] == '/');
 
-      __gconv_path_elem = result ?: (struct path_elem *) &empty_path_elem;
+  /* In a first pass we calculate the number of elements.  */
+  oldp = NULL;
+  cp = strchr (gconv_path, ':');
+  nelems = 1;
+  while (cp != NULL)
+    {
+      if (cp != oldp + 1)
+        ++nelems;
+      oldp = cp;
+      cp = strchr (cp + 1, ':');
+    }
 
-      free (cwd);
+  /* Allocate the memory for the result.  */
+  result = malloc ((nelems + 1)
+                              * sizeof (struct path_elem)
+                              + gconv_path_len + nelems
+                              + (nelems - 1) * (cwdlen + 1));
+  if (result != NULL)
+    {
+      char *strspace = (char *) &result[nelems + 1];
+      int n = 0;
+
+      /* Separate the individual parts.  */
+      __gconv_max_path_elem_len = 0;
+      elem = __strtok_r (gconv_path, ":", &gconv_path);
+      assert (elem != NULL);
+      do
+        {
+          result[n].name = strspace;
+          if (elem[0] != '/')
+            {
+              assert (cwd != NULL);
+              strspace = __mempcpy (strspace, cwd, cwdlen);
+              *strspace++ = '/';
+            }
+          strspace = __stpcpy (strspace, elem);
+          if (strspace[-1] != '/')
+            *strspace++ = '/';
+
+          result[n].len = strspace - result[n].name;
+          if (result[n].len > __gconv_max_path_elem_len)
+            __gconv_max_path_elem_len = result[n].len;
+
+          *strspace++ = '\0';
+          ++n;
+        }
+      while ((elem = __strtok_r (NULL, ":", &gconv_path)) != NULL);
+
+      result[n].name = NULL;
+      result[n].len = 0;
     }
 
-  __libc_lock_unlock (lock);
+  __gconv_path_elem = result ?: (struct path_elem *) &empty_path_elem;
+
+  free (cwd);
 }
 
 
 /* Read all configuration files found in the user-specified and the default
-   path.  */
-void
-attribute_hidden
+   path.  This function should only be called once during the program's
+   lifetime.  It disregards locking and synchronization because its only
+   caller, __gconv_load_conf, handles this.  */
+static void
 __gconv_read_conf (void)
 {
   void *modules = NULL;
@@ -609,6 +605,20 @@  __gconv_read_conf (void)
 }
 
 
+/* This "once" variable is used to do a one-time load of the configuration.  */
+__libc_once_define (static, once);
+
+
+/* Read all configuration files found in the user-specified and the default
+   path, but do it only "once" using __gconv_read_conf to do the actual
+   work.  This is the function that must be called when reading iconv
+   configuration.  */
+void
+__gconv_load_conf (void)
+{
+  __libc_once (once, __gconv_read_conf);
+}
+
 
 /* Free all resources if necessary.  */
 libc_freeres_fn (free_mem)
diff --git a/iconv/gconv_db.c b/iconv/gconv_db.c
index 66e095d8c7..86acfc7d74 100644
--- a/iconv/gconv_db.c
+++ b/iconv/gconv_db.c
@@ -687,10 +687,6 @@  find_derivation (const char *toset, const char *toset_expand,
 }
 
 
-/* Control of initialization.  */
-__libc_once_define (static, once);
-
-
 static const char *
 do_lookup_alias (const char *name)
 {
@@ -709,7 +705,7 @@  __gconv_compare_alias (const char *name1, const char *name2)
   int result;
 
   /* Ensure that the configuration data is read.  */
-  __libc_once (once, __gconv_read_conf);
+  __gconv_load_conf ();
 
   if (__gconv_compare_alias_cache (name1, name2, &result) != 0)
     result = strcmp (do_lookup_alias (name1) ?: name1,
@@ -729,7 +725,7 @@  __gconv_find_transform (const char *toset, const char *fromset,
   int result;
 
   /* Ensure that the configuration data is read.  */
-  __libc_once (once, __gconv_read_conf);
+  __gconv_load_conf ();
 
   /* Acquire the lock.  */
   __libc_lock_lock (__gconv_lock);
diff --git a/iconv/gconv_int.h b/iconv/gconv_int.h
index 45e47a6511..dcdb1bce76 100644
--- a/iconv/gconv_int.h
+++ b/iconv/gconv_int.h
@@ -178,8 +178,8 @@  extern int __gconv_compare_alias_cache (const char *name1, const char *name2,
 extern void __gconv_release_step (struct __gconv_step *step)
      attribute_hidden;
 
-/* Read all the configuration data and cache it.  */
-extern void __gconv_read_conf (void) attribute_hidden;
+/* Read all the configuration data and cache it if not done so already.  */
+extern void __gconv_load_conf (void) attribute_hidden;
 
 /* Try to read module cache file.  */
 extern int __gconv_load_cache (void) attribute_hidden;
diff --git a/iconv/tst-iconv-mt.c b/iconv/tst-iconv-mt.c
new file mode 100644
index 0000000000..aa91a980e5
--- /dev/null
+++ b/iconv/tst-iconv-mt.c
@@ -0,0 +1,142 @@ 
+/* Test that iconv works in a multi-threaded program.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This test runs several worker threads that perform the following three
+   steps in staggered synchronization with each other:
+   1. initialization (iconv_open)
+   2. conversion (iconv)
+   3. cleanup (iconv_close)
+   Having several threads synchronously (using pthread_barrier_*) perform
+   these routines exercises iconv code that handles concurrent attempts to
+   initialize and/or read global iconv state (namely configuration).  */
+
+#include <iconv.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <support/support.h>
+#include <support/xthread.h>
+#include <support/check.h>
+
+#define TCOUNT 16
+_Static_assert (TCOUNT %2 == 0,
+                "thread count must be even, since we need two groups.");
+
+
+#define CONV_INPUT "Hello, iconv!"
+
+
+pthread_barrier_t sync;
+pthread_barrier_t sync_half_pool;
+
+
+/* Execute iconv_open, iconv and iconv_close in a synchronized way in
+   conjunction with other sibling worker threads.  If any step fails, print
+   an error to stdout and return NULL to the main thread to indicate the
+   error.  */
+static void *
+worker (void * arg)
+{
+  long int tidx = (long int) arg;
+
+  iconv_t cd;
+
+  char ascii[] = CONV_INPUT;
+  char *inbufpos = ascii;
+  size_t inbytesleft = sizeof (CONV_INPUT);
+
+  char *utf8 = xcalloc (sizeof (CONV_INPUT), 1);
+  char *outbufpos = utf8;
+  size_t outbytesleft = sizeof (CONV_INPUT);
+
+  if (tidx < TCOUNT/2)
+    /* The first half of the worker thread pool synchronize together here,
+       then call iconv_open immediately after.  */
+    xpthread_barrier_wait (&sync_half_pool);
+  else
+    /* The second half wait for the first half to finish iconv_open and
+       continue to the next barrier (before the call to iconv below).  */
+    xpthread_barrier_wait (&sync);
+
+  /* The above block of code staggers all subsequent pthread_barrier_wait
+     calls in a way that ensures a high chance of encountering these
+     combinations of concurrent iconv usage:
+     1) concurrent calls to iconv_open,
+     2) concurrent calls to iconv_open *and* iconv,
+     3) concurrent calls to iconv,
+     4) concurrent calls to iconv *and* iconv_close,
+     5) concurrent calls to iconv_close.  */
+
+  cd = iconv_open ("UTF8", "ASCII");
+  TEST_VERIFY_EXIT (cd != (iconv_t) -1);
+
+  xpthread_barrier_wait (&sync);
+
+  TEST_VERIFY_EXIT (iconv (cd, &inbufpos, &inbytesleft, &outbufpos,
+                           &outbytesleft)
+                    != (size_t) -1);
+
+  *outbufpos = '\0';
+
+  xpthread_barrier_wait (&sync);
+
+  TEST_VERIFY_EXIT (iconv_close (cd) == 0);
+
+  /* The next conditional barrier wait is needed because we staggered the
+     threads into two groups in the beginning and at this point, the second
+     half of worker threads are waiting for the first half to finish
+     iconv_close before they can executing the same:  */
+  if (tidx < TCOUNT/2)
+    xpthread_barrier_wait (&sync);
+
+  if (strncmp (utf8, CONV_INPUT, sizeof CONV_INPUT))
+    {
+      printf ("FAIL: thread %lx: invalid conversion output from iconv\n", tidx);
+      pthread_exit ((void *) (long int) 1);
+    }
+
+  pthread_exit (NULL);
+}
+
+
+static int
+do_test (void)
+{
+  pthread_t thread[TCOUNT];
+  void * worker_output;
+  int i;
+
+  xpthread_barrier_init (&sync, NULL, TCOUNT);
+  xpthread_barrier_init (&sync_half_pool, NULL, TCOUNT/2);
+
+  for (i = 0; i < TCOUNT; i++)
+    thread[i] = xpthread_create (NULL, worker, (void *) (long int) i);
+
+  for (i = 0; i < TCOUNT; i++)
+    {
+      worker_output = xpthread_join (thread[i]);
+      TEST_VERIFY_EXIT (worker_output == NULL);
+    }
+
+  xpthread_barrier_destroy (&sync);
+  xpthread_barrier_destroy (&sync_half_pool);
+
+  return 0;
+}
+
+#include <support/test-driver.c>