[RFC] execve.2: SYNOPSIS: Document both glibc wrapper and kernel syscalls

Message ID 20210214133907.157320-1-alx.manpages@gmail.com
State Not applicable
Series [RFC] execve.2: SYNOPSIS: Document both glibc wrapper and kernel syscalls

Commit Message

Alejandro Colomar Feb. 14, 2021, 1:39 p.m. UTC
Until now, the manual pages have usually documented either the
glibc (or another library) wrapper for a syscall or the raw
syscall, the latter only when no wrapper exists.

Let's document both prototypes, which often differ slightly.
This will solve the problem that documenting the glibc wrapper
shadows the documentation of the raw syscall.

An explicit comment at the beginning of each prototype will also
make it much clearer to the reader where the syscall comes from
(kernel? glibc? other?).  This removes the need to scroll down
to NOTES to see that info.

Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---

Hi all,

This is a prototype for making some important changes to the SYNOPSIS
of the man-pages.

The commit message above explains the idea quite well.  A few details
that couldn't be shown in this commit are:

For cases where the wrapper is provided by a library other than glibc,
I'd simply change the comment.  For example, for move_pages(2),
it would say /* libnuma wrapper function: */.

I think this would make obsolete the small notes warning that there is
no glibc wrapper function (but we could keep them for some time and
decide that later).

While changing this, I'd also make sure that the headers are correct,
and clearly differentiate which headers are needed for the raw syscall
and for the wrapper function.

This change will probably take more than one release of the man-pages
to complete.

Any thoughts?

Thanks,

Alex

---
 man2/execve.2 | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
  

Comments

Michael Kerrisk (man-pages) Feb. 18, 2021, 12:27 p.m. UTC | #1
Hi Alex,

On 2/14/21 2:39 PM, Alejandro Colomar wrote:
> Any thoughts?

My first impression is that I'm not keen on this. We'll add extra
text to all Section 2 pages, and in many (most?) cases the info
will be redundant (i.e., the wrapper and the syscall() notation
will express the same info). In other cases, I suspect the info
will be largely irrelevant to the user. To take an example: to 
whom will the difference that you document below for execve()
matter, how will it matter, and does it matter enough that we
headline the info in the pages? I'd want cogent answers to
those questions before considering a wide-ranging change.

There are indeed cases where the wrapper API differs in
significant ways from the syscall API (and these differences
are usually captured in the "C library/kernel differences"
subsections, such as for pselect()/pselect6() in select(2)).
But I imagine that that is the case in only a smallish
minority of the pages.
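
The pselect()/pselect6() case can be sketched as follows. This is only an illustration, assuming a 64-bit Linux system where the userspace struct timespec matches the kernel layout; the helper names are made up. Per select(2), the raw pselect6() system call updates its timeout argument, while the glibc wrapper hides that by passing a local copy.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/select.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait 'nsec' nanoseconds through the glibc
   wrapper and return what is left in the caller's timespec.
   The wrapper takes a const pointer, so the value is unchanged. */
static long nsec_left_after_wrapper(long nsec)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = nsec };

    if (pselect(0, NULL, NULL, NULL, &ts, NULL) == -1)
        return -1;
    return ts.tv_nsec;
}

/* Hypothetical helper: same wait through the raw system call.
   The kernel writes the remaining time back into 'ts'.
   Assumption: 64-bit Linux, so struct timespec has the kernel layout. */
static long nsec_left_after_raw(long nsec)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = nsec };

    if (syscall(SYS_pselect6, 0, NULL, NULL, NULL, &ts, NULL) == -1)
        return -1;
    return ts.tv_nsec;
}
```

Calling both helpers with the same timeout shows the difference: the wrapper leaves the value as passed, the raw call leaves less than was passed.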

And indeed there are a very few syscalls that have wrappers
provided in another library. But it's a very small percentage
I think, and best documented case by case in specific pages.
The default presumption is that the wrapper is in the C library.

There are other cases where I think it may be worthwhile
considering the syscall() notation:

1. Where the system call has no wrapper. In that case, we might
   use the syscall() notation in the SYNOPSIS as both
   (a) a clear indication that there is no wrapper and
   (b) instructions to the reader about how to call the
   system call using syscall().

2. In cases where there is a "significant" difference between
   the wrapper and the system call. In this case, we might
   also place the syscall() notation in the SYNOPSIS, or
   (perhaps more likely) in the NOTES
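
A minimal sketch of what point 1(b) means in practice, using gettid(2) as the example because glibc provided no wrapper for it before version 2.30. The helper name raw_gettid is made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>   /* SYS_gettid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: return the caller's thread ID by invoking
   the system call directly, without relying on a libc wrapper. */
static pid_t raw_gettid(void)
{
    return (pid_t) syscall(SYS_gettid);
}
```

A SYNOPSIS written in syscall() notation would give the reader exactly the two headers and the call shape used here.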

Thanks,

Michael

  
Alejandro Colomar Feb. 18, 2021, 2:01 p.m. UTC | #2
Hi Michael,

On 2/18/21 1:27 PM, Michael Kerrisk (man-pages) wrote:
> My first impression is that I'm not keen on this. We'll add extra
> text to all Section 2 pages, and in many (most?) cases the info
> will be redundant (i.e., the wrapper and the syscall() notation
> will express the same info). In other cases, I suspect the info
> will be largely irrelevant to the user. To take an example: to
> whom will the difference that you document below for execve()
> matter, how will it matter, and does it matter enough that we
> headline the info in the pages? I'd want cogent answers to
> those questions before considering a wide-ranging change.

It will matter to:

1) Users of old systems where the glibc wrapper is not yet present.

2) Users of some unicorn Linux distributions that use a C library 
other than glibc and may not have wrappers for some syscalls that 
glibc provides.

3) Library (libc) developers.

Those won't have the glibc wrapper available, and will have to 
use syscall(2).  The kernel syscall info would be highly valuable for 
them.  However, all of them together are probably not many people.
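
As a rough sketch of what such users would write, here is execve invoked through syscall(2) with the const-qualified pointer types from the raw prototype in the patch. The helper names are made up for illustration, and the fork/wait scaffolding is only there so that the call can be tried without replacing the calling process.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>   /* SYS_execve */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid(), WIFEXITED(), WEXITSTATUS() */
#include <unistd.h>        /* syscall(), fork(), _exit() */

/* Hypothetical helper: call execve directly through syscall(2).
   Returns only on failure (-1 with errno set). */
static long raw_execve(const char *pathname,
                       const char *const argv[],
                       const char *const envp[])
{
    return syscall(SYS_execve, pathname, argv, envp);
}

/* Hypothetical helper: run a program in a child process via
   raw_execve() and return its exit status, or -1 on error. */
static int run_via_raw_execve(const char *pathname,
                              const char *const argv[],
                              const char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        raw_execve(pathname, argv, envp);
        _exit(127);            /* reached only if execve failed */
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the arrays can be declared as const char *const [] and passed without casts, which the glibc execve() prototype does not allow.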


> 
> There are indeed cases where the wrapper API differs in
> significant ways from the syscall API (and these differences
> are usually captured in the "C library/kernel differences"
> subsections, such as for pselect()/pselect6() in select(2)).
> But I imagine that that is the case in only a smallish
> minority of the pages.
> 
> And indeed there are a very few syscalls that have wrappers
> provided in another library. But it's a very small percentage
> I think, and best documented case by case in specific pages.
> The default presumption is that the wrapper is in the C library.

Agree.

> 
> There are other cases where I think it may be worthwhile
> considering the syscall() notation:
> 
> 1. Where the system call has no wrapper. In that case, we might
>     use the syscall() notation in the SYNOPSIS as both
>     (a) a clear indication that there is no wrapper and
>     (b) instructions to the reader about how to call the
>     system call using syscall().

Yes.

> 
> 2. In cases where there is a "significant" difference between
>     the wrapper and the system call. In this case, we might
>     also place the syscall() notation in the SYNOPSIS, or
>     (perhaps more likely) in the NOTES

Yes.

I think it would be equally good to have the kernel syscall prototype in 
"C library/kernel ABI differences" in those cases where there is a glibc 
wrapper (even if it's quite different).  It might be even better, as it 
would clearly mark the syscall(2) method as a second-class method that 
should be avoided if possible, and it wouldn't add lines to the SYNOPSIS.

However, we should probably have that subsection for all syscalls, 
including those whose prototype is very similar to the glibc one, to 
support those who need to use the kernel syscall and give them the 
exact types that the kernel expects (except for syscalls without a 
library wrapper, of course, which would have the syscall(SYS_xxx) 
prototype in the SYNOPSIS).

I'll prepare a new RFC with this, with 2 pages:  one with wrapper and 
one without wrapper.

Thanks,

Alex


See also:
<https://lwn.net/Articles/534682/>
<https://www.kernel.org/doc/man-pages/todo.html#migrate_to_kernel_source>


  

Patch

diff --git a/man2/execve.2 b/man2/execve.2
index 639e3b4b9..87ff022ce 100644
--- a/man2/execve.2
+++ b/man2/execve.2
@@ -39,10 +39,18 @@ 
 execve \- execute program
 .SH SYNOPSIS
 .nf
+/* Glibc wrapper function: */
 .B #include <unistd.h>
 .PP
-.BI "int execve(const char *" pathname ", char *const " argv [],
-.BI "           char *const " envp []);
+.BI "int execve(const char *" pathname ",
+.BI "           char *const " argv "[], char *const " envp []);
+.PP
+ /* Raw system call: */
+.B #include <sys/syscall.h>
+.B #include <unistd.h>
+.PP
+.BI "int syscall(SYS_execve, const char *" pathname ,
+.BI "           const char *const " argv "[], const char *const " envp []);
 .fi
 .SH DESCRIPTION
 .BR execve ()