Patchwork bz1311954 - multilib variations in LC_COLLATE files, with fixes

login
register
mail settings
Submitter DJ Delorie
Date March 21, 2019, 4:09 a.m.
Message ID <xnpnqkuauy.fsf@greed.delorie.com>
Download mbox | patch
Permalink /patch/31927/
State New
Headers show

Comments

DJ Delorie - March 21, 2019, 4:09 a.m.
"Carlos O'Donell" <codonell@redhat.com> writes:
> Very minor. Thanks, I think this means that using 50% is the simplest
> and easiest option then. Care to post patches that use 50% with >> ?

iconv, localedef: avoid floating point rounding differences

Two cases of "int * 1.4" may result in imprecise results, which
in at least one case resulted in i686 and x86-64 producing
different locale files.  This replaced that floating point multiply
with integer operations.  While the hash table margin is increased
from 40% to 50%, testing shows only 2% increase in overall size
of the locale archive.

https://bugzilla.redhat.com/show_bug.cgi?id=1311954
Carlos O'Donell - March 21, 2019, 4:23 a.m.
On 3/21/19 12:09 AM, DJ Delorie wrote:
> "Carlos O'Donell" <codonell@redhat.com> writes:
>> Very minor. Thanks, I think this means that using 50% is the simplest
>> and easiest option then. Care to post patches that use 50% with >> ?
> 
> iconv, localedef: avoid floating point rounding differences

This is a user-visible bug and so needs a bugzilla entry, please create
one, and reference the ml discussion, and reference it here.

OK for master with changelog that refs the bug # and commit first line
that refs the bug #.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> Two cases of "int * 1.4" may result in imprecise results, which
> in at least one case resulted in i686 and x86-64 producing
> different locale files.  This replaced that floating point multiply
> with integer operations.  While the hash table margin is increased
> from 40% to 50%, testing shows only 2% increase in overall size
> of the locale archive.

OK.

> https://bugzilla.redhat.com/show_bug.cgi?id=1311954
> 
> diff --git a/iconv/iconvconfig.c b/iconv/iconvconfig.c
> index 0201450742..1e6066cdf0 100644
> --- a/iconv/iconvconfig.c
> +++ b/iconv/iconvconfig.c
> @@ -1079,9 +1079,9 @@ write_output (void)
>   
>     /* Create the hashing table.  We know how many strings we have.
>        Creating a perfect hash table is not reasonable here.  Therefore
> -     we use open hashing and a table size which is the next prime 40%
> +     we use open hashing and a table size which is the next prime 50%

OK.

>        larger than the number of strings.  */
> -  hash_size = next_prime (nnames * 1.4);
> +  hash_size = next_prime (nnames + nnames >> 1);

OK.

>     hash_table = (struct hash_entry *) xcalloc (hash_size,
>   					      sizeof (struct hash_entry));
>     /* Fill the hash table.  */
> diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
> index bb4e2c539d..19b23c2453 100644
> --- a/locale/programs/ld-collate.c
> +++ b/locale/programs/ld-collate.c
> @@ -2401,8 +2401,8 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap,
>   
>         runp = runp->next;
>       }
> -  /* Add 40% and find the next prime number.  */
> -  elem_size = next_prime (elem_size * 1.4);
> +  /* Add 50% and find the next prime number.  */
> +  elem_size = next_prime (elem_size + elem_size >> 1);

OK.

>   
>     /* Allocate the table.  Each entry consists of two words: the hash
>        value and an index in a secondary table which provides the index
>
Florian Weimer - March 21, 2019, 8:52 a.m.
* Carlos O'Donell:

> On 3/21/19 12:09 AM, DJ Delorie wrote:
>> "Carlos O'Donell" <codonell@redhat.com> writes:
>>> Very minor. Thanks, I think this means that using 50% is the simplest
>>> and easiest option then. Care to post patches that use 50% with >> ?
>> 
>> iconv, localedef: avoid floating point rounding differences
>
> This is a user-visible bug and so needs a bugzilla entry, please create
> one, and reference the ml discussion, and reference it here.

I brought this one up on the libstdc++ list as well:

  <https://gcc.gnu.org/ml/libstdc++/2019-03/msg00088.html>

I think with the C++ hashtables, the impact is mostly visible with the
iteration order.
H.J. Lu - April 10, 2019, 5:52 p.m.
On Wed, Mar 20, 2019 at 9:09 PM DJ Delorie <dj@redhat.com> wrote:
>
> "Carlos O'Donell" <codonell@redhat.com> writes:
> > Very minor. Thanks, I think this means that using 50% is the simplest
> > and easiest option then. Care to post patches that use 50% with >> ?
>
> iconv, localedef: avoid floating point rounding differences
>
> Two cases of "int * 1.4" may result in imprecise results, which
> in at least one case resulted in i686 and x86-64 producing
> different locale files.  This replaced that floating point multiply
> with integer operations.  While the hash table margin is increased
> from 40% to 50%, testing shows only 2% increase in overall size
> of the locale archive.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1311954
>
> diff --git a/iconv/iconvconfig.c b/iconv/iconvconfig.c
> index 0201450742..1e6066cdf0 100644
> --- a/iconv/iconvconfig.c
> +++ b/iconv/iconvconfig.c
> @@ -1079,9 +1079,9 @@ write_output (void)
>
>    /* Create the hashing table.  We know how many strings we have.
>       Creating a perfect hash table is not reasonable here.  Therefore
> -     we use open hashing and a table size which is the next prime 40%
> +     we use open hashing and a table size which is the next prime 50%
>       larger than the number of strings.  */
> -  hash_size = next_prime (nnames * 1.4);
> +  hash_size = next_prime (nnames + nnames >> 1);
>    hash_table = (struct hash_entry *) xcalloc (hash_size,
>                                               sizeof (struct hash_entry));
>    /* Fill the hash table.  */
> diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
> index bb4e2c539d..19b23c2453 100644
> --- a/locale/programs/ld-collate.c
> +++ b/locale/programs/ld-collate.c
> @@ -2401,8 +2401,8 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap,
>
>        runp = runp->next;
>      }
> -  /* Add 40% and find the next prime number.  */
> -  elem_size = next_prime (elem_size * 1.4);
> +  /* Add 50% and find the next prime number.  */
> +  elem_size = next_prime (elem_size + elem_size >> 1);
>
>    /* Allocate the table.  Each entry consists of two words: the hash
>       value and an index in a secondary table which provides the index

This caused:

https://sourceware.org/bugzilla/show_bug.cgi?id=24442
Carlos O'Donell - May 6, 2019, 6:40 p.m.
On 4/10/19 1:52 PM, H.J. Lu wrote:
> On Wed, Mar 20, 2019 at 9:09 PM DJ Delorie <dj@redhat.com> wrote:
>>
>> "Carlos O'Donell" <codonell@redhat.com> writes:
>>> Very minor. Thanks, I think this means that using 50% is the simplest
>>> and easiest option then. Care to post patches that use 50% with >> ?
>>
>> iconv, localedef: avoid floating point rounding differences
>>
>> Two cases of "int * 1.4" may result in imprecise results, which
>> in at least one case resulted in i686 and x86-64 producing
>> different locale files.  This replaced that floating point multiply
>> with integer operations.  While the hash table margin is increased
>> from 40% to 50%, testing shows only 2% increase in overall size
>> of the locale archive.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1311954
>>
>> diff --git a/iconv/iconvconfig.c b/iconv/iconvconfig.c
>> index 0201450742..1e6066cdf0 100644
>> --- a/iconv/iconvconfig.c
>> +++ b/iconv/iconvconfig.c
>> @@ -1079,9 +1079,9 @@ write_output (void)
>>
>>     /* Create the hashing table.  We know how many strings we have.
>>        Creating a perfect hash table is not reasonable here.  Therefore
>> -     we use open hashing and a table size which is the next prime 40%
>> +     we use open hashing and a table size which is the next prime 50%
>>        larger than the number of strings.  */
>> -  hash_size = next_prime (nnames * 1.4);
>> +  hash_size = next_prime (nnames + nnames >> 1);
>>     hash_table = (struct hash_entry *) xcalloc (hash_size,
>>                                                sizeof (struct hash_entry));
>>     /* Fill the hash table.  */
>> diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
>> index bb4e2c539d..19b23c2453 100644
>> --- a/locale/programs/ld-collate.c
>> +++ b/locale/programs/ld-collate.c
>> @@ -2401,8 +2401,8 @@ collate_output (struct localedef_t *locale, const struct charmap_t *charmap,
>>
>>         runp = runp->next;
>>       }
>> -  /* Add 40% and find the next prime number.  */
>> -  elem_size = next_prime (elem_size * 1.4);
>> +  /* Add 50% and find the next prime number.  */
>> +  elem_size = next_prime (elem_size + elem_size >> 1);
>>
>>     /* Allocate the table.  Each entry consists of two words: the hash
>>        value and an index in a secondary table which provides the index
> 
> This caused:
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=24442
  
The test is faulty and prone to spurious failure even if the ABI
doesn't change. It checks for binary identical results, but that
isn't true if you change the size of extra hash entries used by
the algorithm. Such a change is not ABI-incompatible. The test
should be updated to actually test the ABI (a harder test to write).

Patch

diff --git a/iconv/iconvconfig.c b/iconv/iconvconfig.c
index 0201450742..1e6066cdf0 100644
--- a/iconv/iconvconfig.c
+++ b/iconv/iconvconfig.c
@@ -1079,9 +1079,9 @@  write_output (void)
 
   /* Create the hashing table.  We know how many strings we have.
      Creating a perfect hash table is not reasonable here.  Therefore
-     we use open hashing and a table size which is the next prime 40%
+     we use open hashing and a table size which is the next prime 50%
      larger than the number of strings.  */
-  hash_size = next_prime (nnames * 1.4);
+  hash_size = next_prime (nnames + nnames >> 1);
   hash_table = (struct hash_entry *) xcalloc (hash_size,
 					      sizeof (struct hash_entry));
   /* Fill the hash table.  */
diff --git a/locale/programs/ld-collate.c b/locale/programs/ld-collate.c
index bb4e2c539d..19b23c2453 100644
--- a/locale/programs/ld-collate.c
+++ b/locale/programs/ld-collate.c
@@ -2401,8 +2401,8 @@  collate_output (struct localedef_t *locale, const struct charmap_t *charmap,
 
       runp = runp->next;
     }
-  /* Add 40% and find the next prime number.  */
-  elem_size = next_prime (elem_size * 1.4);
+  /* Add 50% and find the next prime number.  */
+  elem_size = next_prime (elem_size + elem_size >> 1);
 
   /* Allocate the table.  Each entry consists of two words: the hash
      value and an index in a secondary table which provides the index