[1/4] system_data_types.7: Add '__int128'

Message ID 20201001163443.106933-2-colomar.6.4.3@gmail.com
State Not applicable
Series Document 128-bit types

Commit Message

Alejandro Colomar Oct. 1, 2020, 4:34 p.m. UTC
  Signed-off-by: Alejandro Colomar <colomar.6.4.3@gmail.com>
---
 man7/system_data_types.7 | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
  

Comments

Florian Weimer Oct. 2, 2020, 11:47 a.m. UTC | #1
* Alejandro Colomar via Libc-alpha:

> +the compiler is able to generate efficient code for 128-bit arithmetic".

Stray "?

> +.PP
> +See also the
> +.IR intmax_t ,
> +.IR int N _t

I think it's surprising that intmax_t can be smaller than __int128 (and
usually is, I think), so that's probably worth mentioning.  Assuming
that we want to document __int128 at all as part of the man-pages
project.

Thanks,
Florian
  
Paul Eggert Oct. 2, 2020, 5:52 p.m. UTC | #2
Why describe __int128_t in these man pages at all? __int128_t is not a property 
of either the kernel or of glibc, so it's out of scope.
  
Alejandro Colomar Oct. 2, 2020, 7:01 p.m. UTC | #3
Hi Paul,

On 2020-10-02 19:52, Paul Eggert wrote:
> Why describe __int128_t in these man pages at all? __int128_t is not a 
> property of either the kernel or of glibc, so it's out of scope.

Well, as I see it, [unsigned] __int128 is as good as [u]int64_t.
They are part of the C interface in Linux.
As a programmer I never cared if it was
Glibc providing me the type with a typedef,
or GCC providing me the type with its magic.

If you propose not to document the stdint types either,
I can see your position, but I'll disagree.



A few personal lines:

I just want to use it, and to do that,
I need a reference manual where to look for how to use it.
And believe me that unsigned __int128 has been very useful to me.

I think having this page with most of the types,
in a centralized manner, is exactly
what I would have needed in the past.
I have had trouble finding where ssize_t was defined.
I could have looked at the POSIX manual,
and I would have easily found that it is defined in <sys/types.h>,
but I didn't even know that it was a POSIX thing
(and I can tell that I'm not the only one who didn't know this :),
so I didn't even know where to search for it.

When I wanted to use unsigned __int128, kind of the same thing,
where do you search for the documentation of something
that you don't even know who specified it?

When that happens, the first thing for me is to 'man something'.
If that doesn't give me any useful info, then duckduckgo something.

In the internet there's much info,
but also much of it is incomplete or incorrect,
so if I have the man, I trust the man over anything else
(except for the standard documents, of course).

But the standards documents usually provide the information
in a reverse fashion:
If you know where to look, you'll find it.
But if you only know its name, it'll be hard to find where it is.

The man provides documentation with the name of what you want to
know about.  Simple and easy.

And man is faster than the internet :)


Regards,

Alex.
  
Paul Eggert Oct. 2, 2020, 7:54 p.m. UTC | #4
On 10/2/20 12:01 PM, Alejandro Colomar wrote:
> If you propose not to document the stdint types either,

This is not a stdint.h issue. __int128 is not in stdint.h and is not a system 
data type in any real sense; it's purely a compiler issue. Besides, do we start 
repeating the GCC manual too, while we're at it? At some point we need to 
restrain ourselves and stay within the scope of the man pages.

PS. Have you ever tried to use __int128 in real code? I have, to my regret. It's 
a portability and bug minefield and should not be used unless you really know 
what you're doing, which most people do not.
  
Alejandro Colomar Oct. 2, 2020, 8:03 p.m. UTC | #5
Hi Paul,

On 2020-10-02 21:54, Paul Eggert wrote:
 > On 10/2/20 12:01 PM, Alejandro Colomar wrote:
 >> If you propose not to document the stdint types either,
 >
 > This is not a stdint.h issue. __int128 is not in stdint.h and is not a
 > system data type in any real sense; it's purely a compiler issue.
 > Besides, do we start repeating the GCC manual too, while we're at it? At
 > some point we need to restrain ourselves and stay within the scope of
 > the man pages.

I know it's not in stdint,
but I mean that it behaves as any other stdint type.
So I see value in having them documented together in the same page.
And it's very useful in some (very specific) cases
where portability isn't in mind
(although many compilers are starting to provide this type).

 >
 > PS. Have you ever tried to use __int128 in real code? I have, to my
 > regret. It's a portability and bug minefield and should not be used
 > unless you really know what you're doing, which most people do not.

I have.
I used unsigned __int128, for operating on a big bit matrix.
This type helped me remove a whole abstraction for the columns:

unsigned __int128 mat[128];	// This is a 128x128 bit matrix.

This way I avoided either having to use a shorter type,
which would have been a bit weird:

uint64_t	mat[128][2];	// This is more complicated to see.

Or having to use GMP,
which would also have complicated my code unnecessarily.

And of course, I didn't care about portability,
because I just wanted it to work.

Thanks,

Alex
  
Paul Eggert Oct. 2, 2020, 8:19 p.m. UTC | #6
On 10/2/20 1:03 PM, Alejandro Colomar wrote:
> I know it's not in stdint,
> but I mean that it behaves as any other stdint type.

It doesn't. There's no portable way to use scanf and printf on it. You can't 
reliably convert it to intmax_t. It doesn't have the associated _MIN and _MAX 
macros that the stdint types do. It's a completely different animal.

If all you need are a few bit-twiddling tricks on x86-64, it should work. But 
watch out if you try to do something fancy, like multiply or divide or read or 
print or atomics. There's a good reason it's excluded from intmax_t.
  
Alejandro Colomar Oct. 2, 2020, 11:44 p.m. UTC | #7
Hi Paul,

On 2020-10-02 22:19, Paul Eggert wrote:
 > On 10/2/20 1:03 PM, Alejandro Colomar wrote:
 >> I know it's not in stdint,
 >> but I mean that it behaves as any other stdint type.

With caveats, of course.

 >
 > It doesn't. There's no portable way to use scanf and printf on it.

I didn't need to.  Yes that's a problem.
It may be possible to write a custom specifier for printf,
but I didn't try.  I wrote one for printing binary,
and it's not that difficult.

If you really need it, this might help:

https://github.com/alejandro-colomar/libalx/blob/d193b5648834c135824a5ba68d0ffcd2d38155a8/src/base/stdio/printf/b.c

 > You can't reliably convert it to intmax_t.

Well, intmax_t isn't really that useful.
I see it more like a generic type, than an actual type.

I guess you could have

typedef __int128 intwidest_t;

if you find it's useful to you.


 > It doesn't have the associated _MIN and _MAX macros
 > that the stdint types do. It's a completely different animal.

Those are really easy to write.
For my use cases, they have been enough.
These might be useful to you:


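/* Assumes these typedefs; no standard header provides them. */
typedef unsigned __int128	uint128_t;
typedef __int128		int128_t;

#define UINT128_C(c)	((uint128_t)(c))
#define INT128_C(c)	(( int128_t)(c))
#define UINT128_MAX	((uint128_t)~UINT128_C(0))
#define INT128_MAX	(( int128_t)(UINT128_MAX >> 1))
#define INT128_MIN	(( int128_t)(-INT128_MAX - 1))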


 >
 > If all you need are a few bit-twiddling tricks on x86-64, it should
 > work. But watch out if you try to do something fancy, like multiply or
 > divide or read or print or atomics. There's a good reason it's excluded
 > from intmax_t.

I know, they aren't perfect.
But they are still very useful,
and I don't see a good reason to not document them.

Cheers,

Alex
  
Paul Eggert Oct. 2, 2020, 11:53 p.m. UTC | #8
On 10/2/20 4:44 PM, Alejandro Colomar wrote:

> I know, they aren't perfect.
> But they are still very useful,
> and I don't see a good reason to not document them.

"aren't perfect" is an understatement....

More important, lots of things in GNU C are useful but shouldn't be documented 
in the man pages, because they're out of scope. (The syntax of GNU C strings, 
for example.) The man pages are not intended to be a guide to every feature of 
GNU C. There is the GNU C manual for that, and people can read that.
  
Florian Weimer Oct. 5, 2020, 7:12 a.m. UTC | #9
* Paul Eggert:

> On 10/2/20 12:01 PM, Alejandro Colomar wrote:
>> If you propose not to document the stdint types either,
>
> This is not a stdint.h issue. __int128 is not in stdint.h and is not a
> system data type in any real sense; it's purely a compiler
> issue. Besides, do we start repeating the GCC manual too, while we're
> at it? At some point we need to restrain ourselves and stay within the
> scope of the man pages.

The manual pages also duplicate the glibc manual, and as far as I know,
it's what programmers actually read.  (Downstream, we receive many more
man-pages bugs than glibc or GCC manual bugs.)  Most developers use
distributions which do not ship the glibc or GCC manual for licensing
policy reasons, so the GNU manuals are not installed locally.

> PS. Have you ever tried to use __int128 in real code? I have, to my
> regret. It's a portability and bug minefield and should not be used
> unless you really know what you're doing, which most people do not.

Doesn't this suggest we need improved documentation?

Thanks,
Florian
  
Michael Kerrisk (man-pages) Oct. 7, 2020, 6:53 a.m. UTC | #10
Hi Florian,

On 10/5/20 9:12 AM, Florian Weimer wrote:
> * Paul Eggert:
> 
>> On 10/2/20 12:01 PM, Alejandro Colomar wrote:
>>> If you propose not to document the stdint types either,
>>
>> This is not a stdint.h issue. __int128 is not in stdint.h and is not a
>> system data type in any real sense; it's purely a compiler
>> issue. Besides, do we start repeating the GCC manual too, while we're
>> at it? At some point we need to restrain ourselves and stay within the
>> scope of the man pages.
> 
> The manual pages also duplicate the glibc manual, and as far as I know,
> it's what programmers actually read.  (Downstream, we receive many more
> man-pages bugs than glibc or GCC manual bugs.)  Most developers use
> distributions 

I presume by most you mean "Debian + Ubuntu"? (That certainly
reflects what I see.)

> which do not ship the glibc or GCC manual for licensing
> policy reasons, so the GNU manuals are not installed locally.

I hadn't quite clicked to this point. So, Debian (and thus
Ubuntu) do not ship the glibc manual because of GNU FDL. That's 
unfortunate. (Many years ago, IIRC, there were still one or
two GNU FDL licensed pages in man-pages, and I rewrote / removed
the pages to eliminate the problem for Debian--and I was happy also
at the same time to reduce the number of licenses in man-pages.)

>> PS. Have you ever tried to use __int128 in real code? I have, to my
>> regret. It's a portability and bug minefield and should not be used
>> unless you really know what you're doing, which most people do not.
> 
> Doesn't this suggest we need improved documentation?

Well, yes. But I'm also not convinced man-pages is the right
place for it.

Thanks,

Michael
  
Florian Weimer Oct. 7, 2020, 6:57 a.m. UTC | #11
* Michael Kerrisk:

> Hi Florian,
>
> On 10/5/20 9:12 AM, Florian Weimer wrote:
>> * Paul Eggert:
>> 
>>> On 10/2/20 12:01 PM, Alejandro Colomar wrote:
>>>> If you propose not to document the stdint types either,
>>>
>>> This is not a stdint.h issue. __int128 is not in stdint.h and is not a
>>> system data type in any real sense; it's purely a compiler
>>> issue. Besides, do we start repeating the GCC manual too, while we're
>>> at it? At some point we need to restrain ourselves and stay within the
>>> scope of the man pages.
>> 
>> The manual pages also duplicate the glibc manual, and as far as I know,
>> it's what programmers actually read.  (Downstream, we receive many more
>> man-pages bugs than glibc or GCC manual bugs.)  Most developers use
>> distributions 
>
> I presume by most you mean "Debian + Ubuntu"? (That certainly
> reflects what I see.)

Yes, exactly.  And other distributions inspired by the Debian
interpretation of the DFSG.

>> which do not ship the glibc or GCC manual for licensing
>> policy reasons, so the GNU manuals are not installed locally.
>
> I hadn't quite clicked to this point. So, Debian (and thus
> Ubuntu) do not ship the glibc manual because of GNU FDL. That's 
> unfortunate.

From Debian's point of view, only GFDL plus Invariant Sections is
problematic, but both the glibc and GCC manuals have them.  Plain GFDL
would still be awkward for upstream, but fine from a policy standpoint.

Thanks,
Florian
  

Patch

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index e545aa1a0..5f9aa648f 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -40,6 +40,8 @@  system_data_types \- overview of system data types
 .\"		* Description (no "Description" header)
 .\"			A few lines describing the type.
 .\"
+.\"		* Versions (optional)
+.\"
 .\"		* Conforming to (see NOTES)
 .\"			Format: CXY and later; POSIX.1-XXXX and later.
 .\"
@@ -48,6 +50,44 @@  system_data_types \- overview of system data types
 .\"		* Bugs (if any)
 .\"
 .\"		* See also
+.\"------------------------------------- __int128 ---------------------/
+.TP
+.I __int128
+.RS
+.RI [ signed ]
+.I __int128
+.PP
+A signed integer type
+of a fixed width of exactly 128 bits.
+.PP
+When using GCC,
+it is supported only for targets where
+the compiler is able to generate efficient code for 128-bit arithmetic".
+.PP
+Versions:
+GCC 4.6.0 and later.
+.PP
+Conforming to:
+This is a non-standard extension, present in GCC.
+It is not standardized by the C language standard nor POSIX.
+.PP
+Notes:
+This type is available without including any header.
+.PP
+Bugs:
+It is not possible to express an integer constant of type
+.I __int128
+in implementations where
+.I long long
+is less than 128 bits wide.
+.PP
+See also the
+.IR intmax_t ,
+.IR int N _t
+and
+.I unsigned __int128
+types in this page.
+.RE
 .\"------------------------------------- aiocb ------------------------/
 .TP
 .I aiocb