Host endian independence

Message ID	18c8a820-b2c4-8ab2-58d1-8d8c851dbf01@zamaudio.com
State	Dropped
Headers	Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk Sender: libc-alpha-owner@sourceware.org To: libc-alpha <libc-alpha@sourceware.org> From: Damien Zammit <damien@zamaudio.com> Subject: Host endian independence Openpgp: preference=signencrypt Message-ID: <18c8a820-b2c4-8ab2-58d1-8d8c851dbf01@zamaudio.com> Date: Sat, 24 Aug 2019 14:13:55 +1000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------A97BB7D826EE3F72B02F77EF"

Message ID

18c8a820-b2c4-8ab2-58d1-8d8c851dbf01@zamaudio.com

State

Dropped

Headers

Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
To: libc-alpha <libc-alpha@sourceware.org>
From: Damien Zammit <damien@zamaudio.com>
Subject: Host endian independence
Openpgp: preference=signencrypt
Message-ID: <18c8a820-b2c4-8ab2-58d1-8d8c851dbf01@zamaudio.com>
Date: Sat, 24 Aug 2019 14:13:55 +1000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
	Thunderbird/60.7.0
MIME-Version: 1.0
Content-Type: multipart/mixed;
	boundary="------------A97BB7D826EE3F72B02F77EF"

Commit Message

Damien Zammit Aug. 24, 2019, 4:13 a.m. UTC

  Hi there,

Given that GCC is smart enough to optimize bitshifts of bytes to casts
followed by bswaps if necessary on machines that have such instructions,
I propose to remove most (if not all) dependence on BYTE_ORDER in glibc
so that the same code can be written for any host - this will fix some
edge cases for middle endian machines and make the code easier to read.

There does not seem to be any performance hit (at least on x86
achitectures) by making the code host endian independent by choosing a
particular endian for streams and interpreting all streams in the
correct endian using bitshifts when streams are 4-byte aligned.
(0x11 << 8) | 0x22 = 0x1122 on any host endian machine.

Please see attached patches for proof of concept, which remove
dependence on BYTE_ORDER for two subfolders of glibc.

If this is a desirable change, please let me know and I can continue
this work.

Thanks,
Damien Zammit

Comments

Joseph Myers Aug. 27, 2019, 3:54 p.m. UTC | #1

On Sat, 24 Aug 2019, Damien Zammit wrote:

> Please see attached patches for proof of concept, which remove
> dependence on BYTE_ORDER for two subfolders of glibc.

I don't think these changes are appropriate.

I think we should make more use of the *existing* byte-swap interfaces, 
such as be32toh and be64toh in <endian.h>, rather than inventing new ones.  
By using those interfaces, tzfile.c, for example, could lose some of its 
existing endian checks (that would be a very small local change to the 
implementations of the decode and decode64 functions, larger changes are 
not needed and make the code less clean because the logical information 
that certain data is stored in the files in big-endian format is best kept 
local to the implementations of those two functions, rather than 
hardcoding that information in lots of places with read_be32 and read_be64 
names).  (I'm not convinced that any changes in this area beyond very 
minimal use of bswap_32 would improve the catgets code.)

Damien Zammit Aug. 29, 2019, 4:45 a.m. UTC | #2

Hi Joseph,

Thanks for your reply.

My goal is to introduce the endian-helpers and remove all code relating to byte-swapping,
then the existing byte-swapping interfaces may not be needed and could potentially be removed.

On 28/8/19 1:54 am, Joseph Myers wrote:
> I don't think these changes are appropriate.
> 
> I think we should make more use of the *existing* byte-swap interfaces, 
> such as be32toh and be64toh in <endian.h>, rather than inventing new ones.

Firstly, endian-helpers.h is not a byte-swapping interface, it is a collection of functions
that read/write streams in desired endian, which is something currently missing from glibc.  
I can see that your preference is to reuse existing byte-swap interfaces.
However, I am suggesting that byte-swapping, in general, is unnecessary and a kludge.
It also makes it very confusing for someone reading the code which endian a stream is
stored in when you are swapping based on the host machine's byte order.
Knowing the endianness of a stream in advance gives you the ability to write
one parsing function for each integer type within a stream that works on any host
and the compiler can optimize it.  The endianness of the stream can (and should only)
be handled at the interface where the stream is being converted from bytes to integer types
and vice versa, rather than having structs containing conditionally swapped bytes.

> By using those interfaces, tzfile.c, for example, could lose some of its 
> existing endian checks (that would be a very small local change to the 
> implementations of the decode and decode64 functions, larger changes are 
> not needed and make the code less clean because the logical information 
> that certain data is stored in the files in big-endian format is best kept 
> local to the implementations of those two functions, rather than 
> hardcoding that information in lots of places with read_be32 and read_be64 
> names).

If a stream coming from a file is stored in big endian, why not be explicit in
the naming of functions used to decode it so that it is clear which endian it is?

> ... (I'm not convinced that any changes in this area beyond very 
> minimal use of bswap_32 would improve the catgets code.)

I mostly agree, but I picked off two directories that were easy targets for my
host endian independent demonstration.  The purpose was to keep the code mostly the same,
*except* you can see there is now no more dependence on host BYTE_ORDER at all in the code.

--
Damien Zammit

Joseph Myers Aug. 29, 2019, 4:43 p.m. UTC | #3

On Thu, 29 Aug 2019, Damien Zammit wrote:

> Firstly, endian-helpers.h is not a byte-swapping interface, it is a 
> collection of functions that read/write streams in desired endian, which 
> is something currently missing from glibc.

That reads like simply a personal preference for arranging things 
differently, which isn't a good basis for changing how things are done 
away from very well established and widely used Unix style (functions such 
as htonl; interfaces such as htobe32 are simply a more modern version of 
that, avoiding the historical "network" and "long" concepts implicit in 
the older names).  Avoiding such a header at all means we can have a much 
smaller patch that genuinely improves consistency (reduces the number of 
explicit tests of endianness in glibc) rather than changing things for one 
person's individual taste and means we don't need to go through all the 
many deviations in that header from GNU coding style.

> I can see that your preference is to reuse existing byte-swap 
> interfaces. However, I am suggesting that byte-swapping, in general, is 
> unnecessary and a kludge.

glibc isn't application code.  It's system-level C, written for a 
particular profile of common-usage architectures and ABIs.  In 
system-level C, expressing things in terms of converting between host byte 
order and a particular byte-order for an external interface is entirely 
appropriate.  And interfaces such as htobe32 and be32toh make clear 
exactly what conversions are taking place.

Furthermore, the implementations of those interfaces use __builtin_bswap*, 
which are exactly the appropriate idioms for such byte-swapping in GNU C.

A plausible alternative would be to use the scalar_storage_order attribute 
(added to GCC in 2015, so available in all GCC versions now supported for 
building glibc) - subject to checking how well it is optimized compared to 
the present code (e.g. if a field gets used multiple times, does the 
compiler optimize that to only doing a single endian conversion on it?); 
that could eliminate the need for any explicit endian conversions at all 
in some places.  But that would require careful consideration to gain 
consensus before actually introducing any uses into glibc.

> > By using those interfaces, tzfile.c, for example, could lose some of its 
> > existing endian checks (that would be a very small local change to the 
> > implementations of the decode and decode64 functions, larger changes are 
> > not needed and make the code less clean because the logical information 
> > that certain data is stored in the files in big-endian format is best kept 
> > local to the implementations of those two functions, rather than 
> > hardcoding that information in lots of places with read_be32 and read_be64 
> > names).
> 
> If a stream coming from a file is stored in big endian, why not be 
> explicit in the naming of functions used to decode it so that it is 
> clear which endian it is?

General principles of encapsulation of information in one place.  There is 
one piece of information "tzfile format uses big-endian" that is most 
cleanly kept in one place (so avoiding typos from a single place 
accidentally using *_le32 instead of *_be32, for example).

Yann Droneaud Aug. 30, 2019, 9:16 a.m. UTC | #4

Hi,

Le jeudi 29 août 2019 à 14:45 +1000, Damien Zammit a écrit :
> 
> Thanks for your reply.
> 
> My goal is to introduce the endian-helpers and remove all code relating to byte-swapping,
> then the existing byte-swapping interfaces may not be needed and could potentially be removed.
> 
> On 28/8/19 1:54 am, Joseph Myers wrote:
> > I don't think these changes are appropriate.
> > 
> > I think we should make more use of the *existing* byte-swap interfaces, 
> > such as be32toh and be64toh in <endian.h>, rather than inventing new ones.
> 
> Firstly, endian-helpers.h is not a byte-swapping interface, it is a collection of functions
> that read/write streams in desired endian, which is something currently missing from glibc.  
> I can see that your preference is to reuse existing byte-swap interfaces.
> However, I am suggesting that byte-swapping, in general, is unnecessary and a kludge.
> It also makes it very confusing for someone reading the code which endian a stream is
> stored in when you are swapping based on the host machine's byte order.
> Knowing the endianness of a stream in advance gives you the ability to write
> one parsing function for each integer type within a stream that works on any host
> and the compiler can optimize it.  The endianness of the stream can (and should only)
> be handled at the interface where the stream is being converted from bytes to integer types
> and vice versa, rather than having structs containing conditionally swapped bytes.
> 
> > By using those interfaces, tzfile.c, for example, could lose some of its 
> > existing endian checks (that would be a very small local change to the 
> > implementations of the decode and decode64 functions, larger changes are 
> > not needed and make the code less clean because the logical information 
> > that certain data is stored in the files in big-endian format is best kept 
> > local to the implementations of those two functions, rather than 
> > hardcoding that information in lots of places with read_be32 and read_be64 
> > names).
> 
> If a stream coming from a file is stored in big endian, why not be explicit in
> the naming of functions used to decode it so that it is clear which endian it is?
> 
> > ... (I'm not convinced that any changes in this area beyond very 
> > minimal use of bswap_32 would improve the catgets code.)
> 
> I mostly agree, but I picked off two directories that were easy targets for my
> host endian independent demonstration.  The purpose was to keep the code mostly the same,
> *except* you can see there is now no more dependence on host BYTE_ORDER at all in the code.
> 

This remind me of the Rob Pike article "The byte order fallacy"
https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html

Regards.

Florian Weimer Aug. 30, 2019, 10:19 a.m. UTC | #5

* Joseph Myers:

> On Sat, 24 Aug 2019, Damien Zammit wrote:
>
>> Please see attached patches for proof of concept, which remove
>> dependence on BYTE_ORDER for two subfolders of glibc.
>
> I don't think these changes are appropriate.
>
> I think we should make more use of the *existing* byte-swap interfaces, 
> such as be32toh and be64toh in <endian.h>, rather than inventing new ones.  
> By using those interfaces, tzfile.c, for example, could lose some of its 
> existing endian checks (that would be a very small local change to the 
> implementations of the decode and decode64 functions, larger changes are 
> not needed and make the code less clean because the logical information 
> that certain data is stored in the files in big-endian format is best kept 
> local to the implementations of those two functions, rather than 
> hardcoding that information in lots of places with read_be32 and read_be64 
> names).  (I'm not convinced that any changes in this area beyond very 
> minimal use of bswap_32 would improve the catgets code.)

The problem with the existing interfaces is that that they make it hard
to take alignment issues into account.  See bug 20243.  When used for
parsing packet buffers, they also tend to introduce aliasing violations.

Nowadays, compilers should optimize multiple byte-wise reads into wide
load, possibly followed by a byte swap instruction, on targets that do
not require strict alignment.  (Or a cross-endian load if the target
supports that.)  In the past, it was really necessary for good
performance to have properly aligned buffers and parse them with an
aliasing violation, but that may no longer be necessary.

Obviously, packet parsing and generation should be done with a
higher-level abstraction and not these read/write functions, to enforce
error checking, but these functions could serve as a building block for
that.

Thanks,
Florian

Joseph Myers Aug. 30, 2019, 4:22 p.m. UTC | #6

On Fri, 30 Aug 2019, Florian Weimer wrote:

> The problem with the existing interfaces is that that they make it hard
> to take alignment issues into account.  See bug 20243.  When used for
> parsing packet buffers, they also tend to introduce aliasing violations.

That seems like a matter of declaring your structure appropriately (with 
the right alignment if e.g. it's being allocated on the stack, or with the 
packed attribute if it might genuinely be at arbitrary alignments; with 
the may_alias attribute if it's necessary for it to alias something with a 
different effective type).

The scalar_storage_order attribute could certainly be added to an 
appropriately declared structure to move the endianness information 
entirely into how the structure is defined and out of the code using the 
structure.

Damien Zammit Sept. 1, 2019, 9:26 a.m. UTC | #7

On 31/8/19 2:22 am, Joseph Myers wrote:
> The scalar_storage_order attribute could certainly be added to an 
> appropriately declared structure to move the endianness information 
> entirely into how the structure is defined and out of the code using the 
> structure.

I was not aware of this attribute until now, but looks like the best approach to me.
I might put together a new proof of concept and send in a new set of patches for review.
I'm not expecting it to be merged as is, as you said, Joseph, it would need 
to be carefully considered and tested - I think it would be good to have something to test.
Does libc-alpha have a mailing list convention for sending patches in that are WIP?

Florian Weimer wrote:
> The problem with the existing interfaces is that that they make it hard
> to take alignment issues into account.  See bug 20243.  When used for
> parsing packet buffers, they also tend to introduce aliasing violations.

I had a quick look at bug 20243, would it also be fixed by using the "packed" attribute and
Joseph's idea of putting the endianness info in the definition of the structs with the
"scalar_storage_order" attribute?

Florian Weimer wrote:
> Obviously, packet parsing and generation should be done with a
> higher-level abstraction and not these read/write functions, to enforce
> error checking, but these functions could serve as a building block for
> that.

Neither code like endian-helpers.h nor byte-swapping interfaces seem needed if we use this
attribute and packed structs.  Endianness issues can be solved cleanly and most byte-swapping
in the code can be removed.
Florian, what do you think of using a "scalar_storage_order" and "packed" approach?

Thanks for all the ideas and great answers.

--
Damien Zammit
damien AT zamaudio.com

Florian Weimer Sept. 2, 2019, 7:12 a.m. UTC | #8

* Damien Zammit:

> On 31/8/19 2:22 am, Joseph Myers wrote:
>> The scalar_storage_order attribute could certainly be added to an 
>> appropriately declared structure to move the endianness information 
>> entirely into how the structure is defined and out of the code using the 
>> structure.
>
> I was not aware of this attribute until now, but looks like the best
> approach to me.  I might put together a new proof of concept and send
> in a new set of patches for review.

You need to be careful where you use this features.  Some shared code
cannot use features which are GNU C only (not even the C++ frontend
supports it).  And it's always a bit risky to be an early adopter of
such compiler extensions.

> Florian Weimer wrote:
>> The problem with the existing interfaces is that that they make it hard
>> to take alignment issues into account.  See bug 20243.  When used for
>> parsing packet buffers, they also tend to introduce aliasing violations.
>
> I had a quick look at bug 20243, would it also be fixed by using the
> "packed" attribute and Joseph's idea of putting the endianness info in
> the definition of the structs with the "scalar_storage_order"
> attribute?

Note that either case will need a new struct definition.

Thanks,
Florian

Florian Weimer Sept. 2, 2019, 7:14 a.m. UTC | #9

* Joseph Myers:

> On Fri, 30 Aug 2019, Florian Weimer wrote:
>
>> The problem with the existing interfaces is that that they make it hard
>> to take alignment issues into account.  See bug 20243.  When used for
>> parsing packet buffers, they also tend to introduce aliasing violations.
>
> That seems like a matter of declaring your structure appropriately (with 
> the right alignment if e.g. it's being allocated on the stack, or with the 
> packed attribute if it might genuinely be at arbitrary alignments; with 
> the may_alias attribute if it's necessary for it to alias something with a 
> different effective type).

To some degree, yes, but I don't think the GCC internal type system
supports member alignment (and storage order, for the pragma), so you
need to be extremely careful when building abstractions around that.  It
doesn't help that all the common targets (from my perspective) do not
have alignment traps, so bugs are difficult to spot.

Thanks,
Florian

diff mbox

Patch

From 32a3f800b3977bb7c0826e231c89c8a242cc5e69 Mon Sep 17 00:00:00 2001
From: Damien Zammit <damien@zamaudio.com>
Date: Thu, 22 Aug 2019 20:15:51 +1000
Subject: [PATCH 3/3] catgets: Remove dependence on BYTE_ORDER

---
 catgets/gencat.c       | 42 +++++++++++++++++++-----------------------
 catgets/open_catalog.c | 31 ++++++++++++-------------------
 2 files changed, 31 insertions(+), 42 deletions(-)

diff --git a/catgets/gencat.c b/catgets/gencat.c
index f0d47ae0..e4b73e5d 100644
--- a/catgets/gencat.c
+++ b/catgets/gencat.c
@@ -22,7 +22,7 @@ 
 #include <argp.h>
 #include <assert.h>
 #include <ctype.h>
-#include <endian.h>
+#include <endian-helpers.h>
 #include <errno.h>
 #include <error.h>
 #include <fcntl.h>
@@ -44,10 +44,6 @@ 
 
 #include "catgetsinfo.h"
 
-
-#define SWAPU32(w) \
-  (((w) << 24) | (((w) & 0xff00) << 8) | (((w) >> 8) & 0xff00) | ((w) >> 24))
-
 struct message_list
 {
   int number;
@@ -852,7 +848,8 @@  write_out (struct catalog *catalog, const char *output_name,
   struct obstack string_pool;
   const char *strings;
   size_t strings_size;
-  uint32_t *array1, *array2;
+  uint32_t *array1;
+  uint8_t buf[8];
   size_t cnt;
   int fd;
 
@@ -932,8 +929,6 @@  write_out (struct catalog *catalog, const char *output_name,
   array1 =
     (uint32_t *) alloca (best_size * best_depth * sizeof (uint32_t) * 3);
   memset (array1, '\0', best_size * best_depth * sizeof (uint32_t) * 3);
-  array2
-    = (uint32_t *) alloca (best_size * best_depth * sizeof (uint32_t) * 3);
   obstack_init (&string_pool);
 
   set_run = catalog->all_sets;
@@ -969,10 +964,6 @@  write_out (struct catalog *catalog, const char *output_name,
   strings_size = obstack_object_size (&string_pool);
   strings = obstack_finish (&string_pool);
 
-  /* Compute ARRAY2 by changing the byte order.  */
-  for (cnt = 0; cnt < best_size * best_depth * 3; ++cnt)
-    array2[cnt] = SWAPU32 (array1[cnt]);
-
   /* Now we can write out the whole data.  */
   if (strcmp (output_name, "-") == 0
       || strcmp (output_name, "/dev/stdout") == 0)
@@ -986,19 +977,24 @@  write_out (struct catalog *catalog, const char *output_name,
     }
 
   /* Write out header.  */
-  write (fd, &obj, sizeof (obj));
+  write_le32 (buf, obj.magic);
+  write (fd, buf, sizeof (uint32_t));
+  write_le32 (buf, obj.plane_size);
+  write (fd, buf, sizeof (uint32_t));
+  write_le32 (buf, obj.plane_depth);
+  write (fd, buf, sizeof (uint32_t));
+  /* Nothing to write for obj.name_ptr - it is 0 bytes itself */
 
   /* We always write out the little endian version of the index
-     arrays.  */
-#if __BYTE_ORDER == __LITTLE_ENDIAN
-  write (fd, array1, best_size * best_depth * sizeof (uint32_t) * 3);
-  write (fd, array2, best_size * best_depth * sizeof (uint32_t) * 3);
-#elif __BYTE_ORDER == __BIG_ENDIAN
-  write (fd, array2, best_size * best_depth * sizeof (uint32_t) * 3);
-  write (fd, array1, best_size * best_depth * sizeof (uint32_t) * 3);
-#else
-# error Cannot handle __BYTE_ORDER byte order
-#endif
+     array (LE then BE) */
+  for (cnt = 0; cnt < best_size * best_depth * 3; ++cnt) {
+    write_le32 (buf, array1[cnt]);
+    write (fd, buf, sizeof (uint32_t));
+  }
+  for (cnt = 0; cnt < best_size * best_depth * 3; ++cnt) {
+    write_be32 (buf, array1[cnt]);
+    write (fd, buf, sizeof (uint32_t));
+  }
 
   /* Finally write the strings.  */
   write (fd, strings, strings_size);
diff --git a/catgets/open_catalog.c b/catgets/open_catalog.c
index 9aa6a09f..5c124073 100644
--- a/catgets/open_catalog.c
+++ b/catgets/open_catalog.c
@@ -16,8 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <byteswap.h>
-#include <endian.h>
+#include <endian-helpers.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <string.h>
@@ -32,9 +31,6 @@ 
 #include <not-cancel.h>
 
 
-#define SWAPU32(w) bswap_32 (w)
-
-
 int
 __open_catalog (const char *cat_name, const char *nlspath, const char *env_var,
 		__nl_catd catalog)
@@ -255,11 +251,11 @@  __open_catalog (const char *cat_name, const char *nlspath, const char *env_var,
     }
 
   /* Determine whether the file is a catalog file and if yes whether
-     it is written using the correct byte order.  Else we have to swap
-     the values.  */
-  if (__glibc_likely (catalog->file_ptr->magic == CATGETS_MAGIC))
+     it is written using the correct byte order.  Else we have to read
+     the values in opposite endian.  */
+  if (__glibc_likely (read_le32 (&catalog->file_ptr->magic) == CATGETS_MAGIC))
     swapping = 0;
-  else if (catalog->file_ptr->magic == SWAPU32 (CATGETS_MAGIC))
+  else if (read_be32 (&catalog->file_ptr->magic) == CATGETS_MAGIC)
     swapping = 1;
   else
     {
@@ -275,23 +271,20 @@  __open_catalog (const char *cat_name, const char *nlspath, const char *env_var,
       goto close_unlock_return;
     }
 
-#define SWAP(x) (swapping ? SWAPU32 (x) : (x))
+#define READ32(x) (swapping ? read_be32 (x) : read_le32 (x))
 
   /* Get dimensions of the used hashing table.  */
-  catalog->plane_size = SWAP (catalog->file_ptr->plane_size);
-  catalog->plane_depth = SWAP (catalog->file_ptr->plane_depth);
+  catalog->plane_size = READ32 (&catalog->file_ptr->plane_size);
+  catalog->plane_depth = READ32 (&catalog->file_ptr->plane_depth);
 
   /* The file contains two versions of the pointer tables.  Pick the
      right one for the local byte order.  */
-#if __BYTE_ORDER == __LITTLE_ENDIAN
-  catalog->name_ptr = &catalog->file_ptr->name_ptr[0];
-#elif __BYTE_ORDER == __BIG_ENDIAN
-  catalog->name_ptr = &catalog->file_ptr->name_ptr[catalog->plane_size
+  if (__glibc_unlikely (swapping))
+    catalog->name_ptr = &catalog->file_ptr->name_ptr[catalog->plane_size
 						  * catalog->plane_depth
 						  * 3];
-#else
-# error Cannot handle __BYTE_ORDER byte order
-#endif
+  else
+    catalog->name_ptr = &catalog->file_ptr->name_ptr[0];
 
   /* The rest of the file contains all the strings.  They are
      addressed relative to the position of the first string.  */
-- 
2.13.1