[v9,0/13] implement dlmem() function

Message ID 20230318165110.3672749-1-stsp2@yandex.ru
Headers
Series implement dlmem() function |

Message

stsp March 18, 2023, 4:50 p.m. UTC
  Changes in v9:
- use "zero-copy" machinery instead of memcpy(). It works on linux 5.13
  and newer, falling back to memcpy() otherwise. Suggested by Florian Weimer.
- implement fdlopen() using the above functionality. It is in a new test
  tst-dlmem-fdlopen. Suggested by Carlos O'Donell.
- add DLMEM_DONTREPLACE flag that doesn't replace the backing-store mapping.
  It switches back to memcpy(). Test-case is called tst-dlmem-shm.

Changes in v8:
- drop audit machinery and instead add an extra arg (optional pointer
  to a struct) to dlmem() itself that allows to install a custom premap
  callback or to specify nsid. Audit machinery was meant to allow
  controling over the pre-existing APIs like dlopen(), but if someone
  ever needs such extensions to dlopen(), he can trivially implement
  dlopen() on top of dlmem().

Changes in v7:
- add _dl_audit_premap audit extension and its usage example

Changes in v6:
- use __strdup("") for l_name as suggested by Andreas Schwab

Changes in v5:
- added _dl_audit_premap_dlmem audit extension for dlmem
- added tst-auditmod-dlmem.c test-case that feeds shm fd to dlmem()

Changes in v4:
- re-target to GLIBC_2.38
- add tst-auditdlmem.c test-case to test auditing
- drop length page-aligning in tst-dlmem: mmap() aligns length on its own
- bugfix: in do_mmapcpy() allow mmaps past end of buffer

Changes in v3:
- Changed prototype of dlmem() (and all the internal machinery) to
  use "const unsigned char *buffer" instead of "const char *buffer".

Changes in v2:
- use <support/test-driver.c> instead of "../test-skeleton.c"
- re-target to GLIBC_2.37
- update all libc.abilist files
  

Comments

Adhemerval Zanella Netto March 29, 2023, 12:32 p.m. UTC | #1
On 18/03/23 13:50, Stas Sergeev via Libc-alpha wrote:
> Changes in v9:
> - use "zero-copy" machinery instead of memcpy(). It works on linux 5.13
>   and newer, falling back to memcpy() otherwise. Suggested by Florian Weimer.
> - implement fdlopen() using the above functionality. It is in a new test
>   tst-dlmem-fdlopen. Suggested by Carlos O'Donell.
> - add DLMEM_DONTREPLACE flag that doesn't replace the backing-store mapping.
>   It switches back to memcpy(). Test-case is called tst-dlmem-shm.

Hi Stas,

This discussion has been dragging without much conclusion while you have sent
multiple iterations of the same proposal without addressing the initial issues
Carlos, Jonathan, and myself have raised.

I will not focus on your specific usercase since it is quite singular and I 
think will not be applicable to a libc interface (mapping a VM memory with
different bit size and gluing all together should be an application domain
instead of bleeding out the interface to the libc).

Jonathan already summarized the main issue with dlmem:

  - dlmem() seems to to expect the user to parse the program headers and mmap()
  the binary as required. That requires the application to re-implement a core,
  delicate piece of ld.so... and do so correctly. From an API design perspective,
  that seems like a very poor choice of abstraction.

As well Carlos:

  * No guarantee that the memory that is loaded meets the preconditions for the
  dynamic loader and application uses.

And this alone seems to me that this interface is not easily composable and 
adds a lot of corner cases if the user does not provide an expected ELF image.
This is in par of what Carlos has raised [1] and IMHO you have not addresses the
issues, since you have focused on how this interface applies to your specific
problem instead of see who it fits as a generic libc interface (I will let
Carlos comment further).

As a generic interface I also don't see how this would really fit, it means
that process need to map a memory segment that was generated either through
a JIT or through an external process that can not provide a named file access
(due security limitation).  For these a fdlopen similar to BSD makes more
sense: for former memfd_create can be used, while for later either shm_open 
or passing the fd through unix sockets.

So I don't think dlmem really fits on glibc, although I might take a loot
on the internal code refactoring. I am ccing Rich Felker, creator of musl
libc, since he might have some additional insights whether this interface
make sense or not (and he usually have good ones).

[1] https://sourceware.org/pipermail/libc-alpha/2023-February/145735.html
  
stsp March 29, 2023, 1:10 p.m. UTC | #2
Hi Adhemerval, thanks for a follow-up!

29.03.2023 17:32, Adhemerval Zanella Netto пишет:
> This discussion has been dragging without much conclusion while you 
> have sent
> multiple iterations of the same proposal without addressing the initial issues
> Carlos, Jonathan, and myself have raised.
>
> I will not focus on your specific usercase since it is quite singular and I
> think will not be applicable to a libc interface (mapping a VM memory with
> different bit size and gluing all together should be an application domain
> instead of bleeding out the interface to the libc).

Yes, which is why I am trying to put in
the "widely" requested in the past dlmem()
function, which is quite agnostic to my use-case
but allows me to accomplish the task.


> Jonathan already summarized the main issue with dlmem:
>
>    - dlmem() seems to to expect the user to parse the program headers and mmap()
>    the binary as required. That requires the application to re-implement a core,
>    delicate piece of ld.so... and do so correctly. From an API design perspective,
>    that seems like a very poor choice of abstraction.
>
> As well Carlos:
>
>    * No guarantee that the memory that is loaded meets the preconditions for the
>    dynamic loader and application uses.
>
> And this alone seems to me that this interface is not easily composable and

OK, yes, I've heard all of you, but what
should I do? I can SWEAR that my test-cases
do not parse the elf headers. I can swear
that I have no idea what are you talking about.
Well, I did that already, but nothing helps. :(
So how should I proceed with the claim that
was said by everyone, and yet never confirmed
by a single fact?
I do not parse any elf headers, maybe glibc does,
I do not! If this is wrong, please point me
IN MY TEST-CASES that I actually do this.

> adds a lot of corner cases if the user does not provide an expected ELF image.
> This is in par of what Carlos has raised [1] and IMHO you have not addresses the
> issues, since you have focused on how this interface applies to your specific
> problem instead of see who it fits as a generic libc interface (I will let
> Carlos comment further).

I have not addressed that issue, I have
not seen that issue, I don't believe this
issue even exist.
What should I do?
Please, be constructive.
You can point it to me IF it is there.

> So I don't think dlmem really fits on glibc, although I might take a loot
> on the internal code refactoring. I am ccing Rich Felker, creator of musl
> libc, since he might have some additional insights whether this interface
> make sense or not (and he usually have good ones).
Thanks!
I appreciate the involvement of Rich Felker,
or any other activity to bring that discussion
further. But we need to get off the aforementioned
question around the elf headers.

Also please note that I have a v10 locally with
some fixes and new test-cases, so v9 should
not be reviewed quite literally.
  
Carlos O'Donell March 29, 2023, 1:17 p.m. UTC | #3
On 3/18/23 12:50, Stas Sergeev via Libc-alpha wrote:

A cover letter needs to explain in detail what the series does and why glibc
should include the series e.g. use cases, workloads.

Adding a new API is a serious thing for glibc since we will keep the new interface
for a long time, like the Linux kernel, and need to ensure that it is a generic
enough building block that we can use.

Please provide the details here about why glibc should incldue dlmem() and which
use cases it enables that existing APIs can't easily support.

> Changes in v9:
> - use "zero-copy" machinery instead of memcpy(). It works on linux 5.13
>   and newer, falling back to memcpy() otherwise. Suggested by Florian Weimer.
> - implement fdlopen() using the above functionality. It is in a new test
>   tst-dlmem-fdlopen. Suggested by Carlos O'Donell.
> - add DLMEM_DONTREPLACE flag that doesn't replace the backing-store mapping.
>   It switches back to memcpy(). Test-case is called tst-dlmem-shm.
> 
> Changes in v8:
> - drop audit machinery and instead add an extra arg (optional pointer
>   to a struct) to dlmem() itself that allows to install a custom premap
>   callback or to specify nsid. Audit machinery was meant to allow
>   controling over the pre-existing APIs like dlopen(), but if someone
>   ever needs such extensions to dlopen(), he can trivially implement
>   dlopen() on top of dlmem().
> 
> Changes in v7:
> - add _dl_audit_premap audit extension and its usage example
> 
> Changes in v6:
> - use __strdup("") for l_name as suggested by Andreas Schwab
> 
> Changes in v5:
> - added _dl_audit_premap_dlmem audit extension for dlmem
> - added tst-auditmod-dlmem.c test-case that feeds shm fd to dlmem()
> 
> Changes in v4:
> - re-target to GLIBC_2.38
> - add tst-auditdlmem.c test-case to test auditing
> - drop length page-aligning in tst-dlmem: mmap() aligns length on its own
> - bugfix: in do_mmapcpy() allow mmaps past end of buffer
> 
> Changes in v3:
> - Changed prototype of dlmem() (and all the internal machinery) to
>   use "const unsigned char *buffer" instead of "const char *buffer".
> 
> Changes in v2:
> - use <support/test-driver.c> instead of "../test-skeleton.c"
> - re-target to GLIBC_2.37
> - update all libc.abilist files
>
  
stsp March 29, 2023, 1:18 p.m. UTC | #4
Just to be more constructive, here is the
example:

void *dlopen_with_offset(const char *file, off_t offset, int flags)
{
     off_t len;
     void *addr;
     void *handle;
     int fd;

     fd = open(file, O_RDONLY);
     if (fd == -1)
         return NULL;
     len = lseek(fd, 0, SEEK_END);
     lseek(fd, 0, SEEK_SET);
     if (len <= offset)
       goto err_close;
     len -= offset;
     /* if offset unaligned then just use read() */
     if (offset & (PAGE_SIZE - 1)) {
         int rd;
         addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
         if (addr == MAP_FAILED)
             goto err_close;
         rd = read(fd, addr, len);
         if (rd != len) {
             munmap(addr, len);
             goto err_close;
         }
     } else {
         addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, offset);
     }
     close(fd);
     if (addr == MAP_FAILED)
         return NULL;
     handle = dlmem(addr, len, flags, NULL);
     munmap(addr, len);
     return handle;

err_close:
     close(fd);
     return NULL;
}


Does it parse an elf headers?
If so - where?
  
stsp March 29, 2023, 1:26 p.m. UTC | #5
Hi,

29.03.2023 18:17, Carlos O'Donell пишет:
> On 3/18/23 12:50, Stas Sergeev via Libc-alpha wrote:
>
> A cover letter needs to explain in detail what the series does and why glibc
> should include the series e.g. use cases, workloads.
>
> Adding a new API is a serious thing for glibc since we will keep the new interface
> for a long time, like the Linux kernel, and need to ensure that it is a generic
> enough building block that we can use.
>
> Please provide the details here about why glibc should incldue dlmem() and which
> use cases it enables that existing APIs can't easily support.
OK, I'll post the extensive description
along with v10 soon, thanks!
  
stsp March 29, 2023, 5:03 p.m. UTC | #6
29.03.2023 18:17, Carlos O'Donell пишет:
> On 3/18/23 12:50, Stas Sergeev via Libc-alpha wrote:
>
> A cover letter needs to explain in detail what the series does and why glibc
> should include the series e.g. use cases, workloads.
Attaching a cover letter for the upcoming
v10. The patch-set is not yet completed so
please let me know if the cover is now
adequate or needs more descriptions.

Also it would be good if we settle on this
"elf header" stuff before v10 is released,
so I am eagerly waiting your reply on whether
the proof was enough or not.
I can try to bring you any proof you want,
but I can't be asked in a way "its possible
to create an elf that can break your impl,
please prove the opposite". Proving that
something can't be created, usually requires
infinite amount of time. So any proof you
want, needs to be finite and well-defined.
  
Carlos O'Donell March 29, 2023, 6:13 p.m. UTC | #7
On 3/29/23 13:03, stsp wrote:
> 
> 29.03.2023 18:17, Carlos O'Donell пишет:
>> On 3/18/23 12:50, Stas Sergeev via Libc-alpha wrote:
>>
>> A cover letter needs to explain in detail what the series does and why glibc
>> should include the series e.g. use cases, workloads.
> Attaching a cover letter for the upcoming
> v10. The patch-set is not yet completed so
> please let me know if the cover is now
> adequate or needs more descriptions.
> 
> Also it would be good if we settle on this
> "elf header" stuff before v10 is released,
> so I am eagerly waiting your reply on whether
> the proof was enough or not.
> I can try to bring you any proof you want,
> but I can't be asked in a way "its possible
> to create an elf that can break your impl,
> please prove the opposite". Proving that
> something can't be created, usually requires
> infinite amount of time. So any proof you
> want, needs to be finite and well-defined.

> From 46e5095ebfe63be4dcd813c4237d6a491a3f9768 Mon Sep 17 00:00:00 2001
> From: Stas Sergeev <stsp2@yandex.ru>
> Date: Mon, 13 Feb 2023 18:15:34 +0500
> Subject: [PATCH v10 0/12] implement dlmem() function

The most important thing is the reasons for the change and that should come first.

> Changes in v10:
> - squashed patch 1 as suggested by Adhemerval Zanella
> - fixed a few bugs in an elf relocation machinery after various hot discussions
> - added a new test tst-dlmem-dloff that demo-implements dlopen_with_offset()
> 
> Changes in v9:
> - use "zero-copy" machinery instead of memcpy(). It works on linux 5.13
>   and newer, falling back to memcpy() otherwise. Suggested by Florian Weimer.
> - implement fdlopen() using the above functionality. It is in a new test
>   tst-dlmem-fdlopen. Suggested by Carlos O'Donell.
> - add DLMEM_DONTREPLACE flag that doesn't replace the backing-store mapping.
>   It switches back to memcpy(). Test-case is called tst-dlmem-shm.
> 
> Changes in v8:
> - drop audit machinery and instead add an extra arg (optional pointer
>   to a struct) to dlmem() itself that allows to install a custom premap
>   callback or to specify nsid. Audit machinery was meant to allow
>   controling over the pre-existing APIs like dlopen(), but if someone
>   ever needs such extensions to dlopen(), he can trivially implement
>   dlopen() on top of dlmem().
> 
> Changes in v7:
> - add _dl_audit_premap audit extension and its usage example
> 
> Changes in v6:
> - use __strdup("") for l_name as suggested by Andreas Schwab
> 
> Changes in v5:
> - added _dl_audit_premap_dlmem audit extension for dlmem
> - added tst-auditmod-dlmem.c test-case that feeds shm fd to dlmem()
> 
> Changes in v4:
> - re-target to GLIBC_2.38
> - add tst-auditdlmem.c test-case to test auditing
> - drop length page-aligning in tst-dlmem: mmap() aligns length on its own
> - bugfix: in do_mmapcpy() allow mmaps past end of buffer
> 
> Changes in v3:
> - Changed prototype of dlmem() (and all the internal machinery) to
>   use "const unsigned char *buffer" instead of "const char *buffer".
> 
> Changes in v2:
> - use <support/test-driver.c> instead of "../test-skeleton.c"
> - re-target to GLIBC_2.37
> - update all libc.abilist files
> 
> This patch-set implements the dlmem() function that allows to load
> the solib from page-aligned memory buffer. It has lots of optional
> functionality for the fine-grained control over the loading process.
> The API looks as below:

This needs to explain the workload that requires the API.

Why that workload is generic.

Any core system library, like glibc, is not the place for *all* APIs, but the
place for the most generic building blocks for ISO C, POSIX, BSD, GNU, and
Linux APIs.
 
> /* Callback for dlmem. */
> typedef void *
> (dlmem_premap_t) (void *mappref, size_t maplength, size_t mapalign,
> 	          void *cookie);
> 
> /* Do not replace mapping created by premap callback.
>    dlmem() will then use memcpy(). */
> #define DLMEM_DONTREPLACE 1
> 
> struct dlmem_args {
>   /* Optional name to associate with the loaded object. */
>   const char *soname;
>   /* Namespace where to load the object. */
>   Lmid_t nsid;
>   /* dlmem-specific flags. */
>   unsigned flags;
>   /* Optional premap callback. */
>   dlmem_premap_t *premap;
>   /* Optional argument for premap callback. */
>   void *cookie;
> };
> 
> /* Like `dlmopen', but loads shared object from memory buffer.  */
> extern void *dlmem (const unsigned char *buffer, size_t size, int mode,
> 		    struct dlmem_args *dlm_args);
> 
> 
> In most cases dlm_args should just be set to NULL. It provides the
> advanced functionality, most of which is obvious (soname, nsid).
> The premap callback allows to set the relocation address for the solib.
> More so, if DLMEM_DONTREPLACE flag is used, then the mapping established
> by the premap callback, will not be replaced with the file-backed mapping.
> In that case dlmem() have to use memcpy(), which is likely even faster
> than mmaps() but doesn't end up with the proper /proc/self/map_files
> or /proc/self/maps entries. So for example if the premap callback uses
> MAP_SHARED, then with the use of the DLMEM_DONTREPLACE flag you can get
> your solib relocated into a shared memory buffer. Such functionality
> may be interesting for virtualized environments where the relocation
> address may have a special constraints, like eg MAP_32BIT.

It is not that the workload you have is unimportant, since importance is relative,
but the workload is not generic enough for glibc to add dlmem() into the library.

You need to have a very strong argument for the question "Why should this go into
the library?"
 
> -- 
> 2.37.2
>
  
stsp March 29, 2023, 6:29 p.m. UTC | #8
29.03.2023 23:13, Carlos O'Donell пишет:
> It is not that the workload you have is unimportant, since importance is relative,
> but the workload is not generic enough for glibc to add dlmem() into the library.
>
> You need to have a very strong argument for the question "Why should this go into
> the library?"
I'll write that down to the cover-letter,
but I'd like to also summarize it here to
get your opinion before posting the patch.

My view on that picture is that dlmem()
is a very generic API that allows to implement
eg dlopen_with_offset() in a dozen of lines.
And in that case - quite importantly -
no "advanced" features are even used, so
the last arg is just NULL.

Now the advanced features offered by the
last argument, are indeed quite specific to
my not-so-common use-case. So my idea
is exactly that: implement what other ppl
asked before, make it powerful and generic,
and then sneak in an optional pointer arg
for myself. :)
Is it a viable plan?
I also tried to "hide" the me-specific functionality
in an auditor - that plan was rejected as
non-viable. Optional pointer arg looks a
bit better to me. :)

So I am trying to put this not as "I want
to implement the function solely for myself",
but as "please let me have an optional
pointer arg in a very generic and powerful
function". :)
  
stsp March 31, 2023, 11:04 a.m. UTC | #9
Hi Carlos, Jonathon.

29.03.2023 23:13, Carlos O'Donell пишет:
> The most important thing is the reasons for the change and that should come first.

Done, please see the attachment.

> This needs to explain the workload that requires the API.
>
> Why that workload is generic.
>
> Any core system library, like glibc, is not the place for *all* APIs, but the
> place for the most generic building blocks for ISO C, POSIX, BSD, GNU, and
> Linux APIs.

I hope I've got that suggestion rightly.
Please see if my current description is
adequate.

And on the other front...
I studied and documented (in the attachment)
all the cases where my impl fails to arrange
an elf segments properly... I have to admit
such cases were possible. :(
I documented them and their mitigations in
the "Limitations" section.
Let me know if this is now adequate.
  
Szabolcs Nagy March 31, 2023, 12:20 p.m. UTC | #10
The 03/29/2023 18:18, stsp via Libc-alpha wrote:
> Just to be more constructive, here is the
> example:
> 
> void *dlopen_with_offset(const char *file, off_t offset, int flags)
> {

this api requires no libc change:

dump the contents to a temp file and use dlopen on it.


>     off_t len;
>     void *addr;
>     void *handle;
>     int fd;
> 
>     fd = open(file, O_RDONLY);
>     if (fd == -1)
>         return NULL;
>     len = lseek(fd, 0, SEEK_END);
>     lseek(fd, 0, SEEK_SET);
>     if (len <= offset)
>       goto err_close;
>     len -= offset;
>     /* if offset unaligned then just use read() */
>     if (offset & (PAGE_SIZE - 1)) {
>         int rd;
>         addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
>                     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>         if (addr == MAP_FAILED)
>             goto err_close;
>         rd = read(fd, addr, len);
>         if (rd != len) {
>             munmap(addr, len);
>             goto err_close;
>         }
>     } else {
>         addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, offset);
>     }
>     close(fd);
>     if (addr == MAP_FAILED)
>         return NULL;
>     handle = dlmem(addr, len, flags, NULL);
>     munmap(addr, len);
>     return handle;
> 
> err_close:
>     close(fd);
>     return NULL;
> }
> 
> 
> Does it parse an elf headers?
> If so - where?

it creates a mapping that's not compatible with dynamic loading
requirements *depending on* what the elf headers say.
(elf segments may be loaded at various offsets that can conflict
with the mapping that you just made)

but i'm not going to think about all the different ways this
is broken: you cut across abstraction layers exposing dynamic
loading internals where a temp file works fine. if there is
a performance problem (e.g. page sharing) likely that can be
solved too via kernel level apis, not ld.so.
  
stsp March 31, 2023, 1:51 p.m. UTC | #11
Hi Szabolcs, thanks for a follow-up.
Unfortunately I need to bypass your
comment, as all these claims were
1000 times refuted already on that
ML and also here:
https://sourceware.org/bugzilla/show_bug.cgi?id=30007

I'd be glad if you ask a concrete questions.
I want to answer and discuss things with
you or anyone else. Except that the "walking
in circles forever" scenario must be avoided.
  
stsp March 31, 2023, 2:44 p.m. UTC | #12
Hi Carlos.

The last letter from Szabolcs and a few
private e-mails I've got, suggest that this
"elf parsing" argument keeps raging in.

Carlos, you are the only person here who
asked me to _prove_ the problem is not
there, others just keep throwing it to me.

Carlos, you are the only person to whom
I presented such a proof. As the result, I
really need your answer on whether the
proof was satisfactory or not. Or if you
need some other proof - I need to know
about that. Getting rid of that "elf parsing"
argument seems to be possible only if you,
Carlos, permit that to happen. Otherwise
it doesn't matter what I do, I'll be always
thrown with that argument, and then we
can't progress...
  
Rich Felker March 31, 2023, 2:49 p.m. UTC | #13
On Fri, Mar 31, 2023 at 06:51:24PM +0500, stsp wrote:
> Hi Szabolcs, thanks for a follow-up.
> Unfortunately I need to bypass your
> comment, as all these claims were
> 1000 times refuted already on that
> ML and also here:
> https://sourceware.org/bugzilla/show_bug.cgi?id=30007
> 
> I'd be glad if you ask a concrete questions.
> I want to answer and discuss things with
> you or anyone else. Except that the "walking
> in circles forever" scenario must be avoided.

The "walking in circles forever scenario" can be avoided by dropping
this waste of everybody's time. You don't get to shoehorn pet
functionality into libc.

Rich
  
stsp March 31, 2023, 2:56 p.m. UTC | #14
31.03.2023 19:49, Rich Felker пишет:
> The "walking in circles forever scenario" can be avoided by dropping
> this waste of everybody's time. You don't get to shoehorn pet
> functionality into libc.

Thanks.
I attached the detailed description of the
API. Maybe if you read that, your statements
would be more concrete.

And no, this API doesn't require elf header
parsing by user, or anything else alike.
It always lays out an elf image per vaddr's.
I demonstrated that to Carlos, so if he
permits this false claim about my patch
to go away, then the argument would
disappear.

And no, it can't be replaced by memfd and
alike techniques - I proved that many times,
and at least Jonathon accepted that fact
after a month of discussions.

Now if I can convince some individuals in
some parts of the mosaic, I have no idea
how to convince everyone at once...
  
Rich Felker March 31, 2023, 2:58 p.m. UTC | #15
On Fri, Mar 31, 2023 at 07:56:41PM +0500, stsp wrote:
> 
> 31.03.2023 19:49, Rich Felker пишет:
> >The "walking in circles forever scenario" can be avoided by dropping
> >this waste of everybody's time. You don't get to shoehorn pet
> >functionality into libc.
> 
> Thanks.
> I attached the detailed description of the
> API. Maybe if you read that, your statements
> would be more concrete.
> 
> And no, this API doesn't require elf header
> parsing by user, or anything else alike.
> It always lays out an elf image per vaddr's.
> I demonstrated that to Carlos, so if he
> permits this false claim about my patch
> to go away, then the argument would
> disappear.
> 
> And no, it can't be replaced by memfd and
> alike techniques - I proved that many times,
> and at least Jonathon accepted that fact
> after a month of discussions.
> 
> Now if I can convince some individuals in
> some parts of the mosaic, I have no idea
> how to convince everyone at once...

I'm not going to read this any more than I'm going to watch a YouTube
video about free energy. It's been rehashed over and over why this
does not work. Please just stop.

Rich
  
stsp March 31, 2023, 3:03 p.m. UTC | #16
31.03.2023 19:58, Rich Felker пишет:
> I'm not going to read this any more than I'm going to watch a YouTube
> video about free energy. It's been rehashed over and over why this
> does not work. Please just stop.


Sorry but it works and lays out an
elf segments properly. My test-case has
a proof of that:


$ LD_LIBRARY_PATH=..:. ./tst-dlmem-fdlopen
unaligned buf gives buffer not aligned: Invalid argument
7fb413101000-7fb413102000 r--p 00000000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413102000-7fb413103000 r-xp 00001000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413103000-7fb413104000 r--p 00002000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413104000-7fb413105000 r--p 00002000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413105000-7fb413106000 rw-p 00003000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so


If it didn't work, there would be only
1 reference to the solib. But after dlmem,
in this test-case there are 5. This proves
that the elf was laid out correctly per
vaddr's.
  
stsp March 31, 2023, 3:12 p.m. UTC | #17
Hi Szabolcs, I need to try again with answering
this.

31.03.2023 17:20, Szabolcs Nagy пишет:
> The 03/29/2023 18:18, stsp via Libc-alpha wrote:
>> Just to be more constructive, here is the
>> example:
>>
>> void *dlopen_with_offset(const char *file, off_t offset, int flags)
>> {
> this api requires no libc change:
>
> dump the contents to a temp file and use dlopen on it.

This doesn't work for my use-case.
dlmem() can optionally preserve the destination
mapping, which is not possible by definition with
any file-based API.

> it creates a mapping that's not compatible with dynamic loading
> requirements *depending on* what the elf headers say.
> (elf segments may be loaded at various offsets that can conflict
> with the mapping that you just made)
No, conflict is not possible as they are
loaded to entirely separately mmapped buffer.
And I have a proof that this works as expected:

$ LD_LIBRARY_PATH=..:. ./tst-dlmem-fdlopen
unaligned buf gives buffer not aligned: Invalid argument
7fb413101000-7fb413102000 r--p 00000000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413102000-7fb413103000 r-xp 00001000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413103000-7fb413104000 r--p 00002000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413104000-7fb413105000 r--p 00002000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413105000-7fb413106000 rw-p 00003000 00:28 17195405 
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so


In this example you see 5 references to the
solib, all created by dlmem. There is no
conflict with the initial buffer, such conflict
is not even possible.
  
Szabolcs Nagy March 31, 2023, 5:12 p.m. UTC | #18
The 03/31/2023 20:12, stsp wrote:
> 31.03.2023 17:20, Szabolcs Nagy пишет:
> > The 03/29/2023 18:18, stsp via Libc-alpha wrote:
> > > Just to be more constructive, here is the
> > > example:
> > > 
> > > void *dlopen_with_offset(const char *file, off_t offset, int flags)
> > > {
> > this api requires no libc change:
> > 
> > dump the contents to a temp file and use dlopen on it.
> 
> This doesn't work for my use-case.
> dlmem() can optionally preserve the destination
> mapping, which is not possible by definition with
> any file-based API.

loading from special place and loading into special place
are different things.

you gave an example of the former but then talk about the
latter.

loading from special place can be solved by writing the
contents to a temp file and loading that.

loading into special place is difficult especially if you
want to control the details of segment mapping. in case of
your dlmem design it passes down a user callback that runs
under dynamic linker locks to map segments. the interface
contract of this callback is not documented, so it can
deadlock and exposes dynamic linker internals (makes
implementation changes harder later).

if all you want is force mapping below 4G then you should
have asked for an RTLD_ flag (which may not be useful
enough or may not be supportable on all relevant targets,
but at least does not need significant loader changes).
  
stsp March 31, 2023, 5:36 p.m. UTC | #19
Hi!

31.03.2023 22:12, Szabolcs Nagy пишет:
> loading from special place and loading into special place
> are different things.

Thank you for the technical comment!
Somehow that happened that this thread
went into non-technical very quickly. :(

> you gave an example of the former but then talk about the
> latter.

These are related.
File-based mmaps leave you no chance
of preserving the destination mapping,
so for my needs I took the interface for
which mmaps are not mandatory.
And then:
1. It suits my use-case surprisingly well
2. It suits as a base block for implementing
things like dlopen_with_offset(), which I
was planning to present in a subsequent
patch series. But given the fact that people
stopped believing my impl even works, I
am not so sure...

> loading from special place can be solved by writing the
> contents to a temp file and loading that.

That will end up with the casual mmaps(),
so loading to special place won't be achieved
as the result.

> loading into special place is difficult especially if you
> want to control the details of segment mapping.

Its difficult.
But my patch accumulated months of a work,
so its just here.


>   in case of
> your dlmem design it passes down a user callback that runs
> under dynamic linker locks to map segments. the interface
> contract of this callback is not documented, so it can
> deadlock

Good point.
But this is a documentation issue, right?
Not too much is needed: in the simple case
you only do anonymous mmap() in a callback.
In a more difficult case (my use-case) I do
shm_open(), ftruncate() and mmap().
Of course I need to document that any libdl
functions are disallowed, is this correct?


>   and exposes dynamic linker internals (makes
> implementation changes harder later).

Could you please clarify that part?
typedef void *
(dlmem_premap_t) (void *mappref, size_t maplength, size_t mapalign,
               void *cookie);

This only passes things needed for an initial
mmap(). The single mmap() that just defines
the mapping address.
What internals do you mean?

Also please note that the callback is optional
and is not needed for 99% of the interesting
cases possible with dlmem().

> if all you want is force mapping below 4G then you should
> have asked for an RTLD_ flag (which may not be useful
> enough or may not be supportable on all relevant targets,
> but at least does not need significant loader changes).
There is already such flag.
Unfortunately my setup is much more
complicated than that: I need to force
the mapping under 4Gb AND I need to
force it into the shared buffer which I
then mmap() to another address...
Yes, this may sound weird, and I am not
sure if you want to read my entire discussion
with Jonathon on why is that needed.
But he did a good small summary here:
https://sourceware.org/bugzilla/show_bug.cgi?id=30007#c8

Yes, its a technically difficult topic.
Yes, we spent months on only me explaining
my use-case! But why it suddenly downgraded
to flames, is a bit unclear.
  
stsp March 31, 2023, 6:47 p.m. UTC | #20
31.03.2023 22:12, Szabolcs Nagy пишет:
> your dlmem design it passes down a user callback that runs
> under dynamic linker locks to map segments.
Oh, I missed that sentence initially.
No, its not to map segments.
Its to do the single initial mmap() only.
The segments are of course all mapped
by the loader internally.
Which is why this callback is optional and
should almost never be used.

Does this clarification help?

Btw, I am thankful to you for reading an
API description. Not too many people did
that. Carlos and Jonathon definitely did,
so I am thankful to them as well.
  
stsp March 31, 2023, 7 p.m. UTC | #21
Szabolcs,

31.03.2023 23:47, stsp пишет:
>
> 31.03.2023 22:12, Szabolcs Nagy пишет:
>> your dlmem design it passes down a user callback that runs
>> under dynamic linker locks to map segments.
> Oh, I missed that sentence initially.
> No, its not to map segments.
> Its to do the single initial mmap() only.
> The segments are of course all mapped
> by the loader internally.
> Which is why this callback is optional and
> should almost never be used.
>
> Does this clarification help?
Actually let me show you the fdlopen()
example on top of dlmem, to make it
crystal clear that the premap callback
has nothing to do with anything, and
should just be ignored:

static void *
fdlopen (int fd, int flags)
{
   off_t len;
   void *addr;
   void *handle;

   len = lseek (fd, 0, SEEK_END);
   lseek (fd, 0, SEEK_SET);
   addr = mmap (NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
   if (addr == MAP_FAILED)
     {
       printf ("cannot mmap, %s\n", strerror(errno));
       exit (EXIT_FAILURE);
     }
   handle = dlmem (addr, len, flags, NULL);
   munmap (addr, len);
   return handle;

}

After running this code, we have this /proc/self/maps:

7fb413101000-7fb413102000 r--p 00000000 00:28 17195405 /home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413102000-7fb413103000 r-xp 00001000 00:28 17195405 /home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413103000-7fb413104000 r--p 00002000 00:28 17195405 /home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413104000-7fb413105000 r--p 00002000 00:28 17195405 /home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413105000-7fb413106000 rw-p 00003000 00:28 17195405 /home/stas/src/glibc-dev/build/dlfcn/glreflib1.so

So the callback is not involved.
As you can see its set to NULL.
The loader is fully capable, callback
is not needed.

Does this clarify?
  
stsp April 1, 2023, 9:28 a.m. UTC | #22
31.03.2023 22:36, stsp пишет:
> There is already such flag.

Namely, LD_PREFER_MAP_32BIT_EXEC.


> Unfortunately my setup is much more
> complicated than that: I need to force
> the mapping under 4Gb AND I need to
> force it into the shared buffer which I
> then mmap() to another address...
> Yes, this may sound weird, and I am not
> sure if you want to read my entire discussion
> with Jonathon on why is that needed.
> But he did a good small summary here:
> https://sourceware.org/bugzilla/show_bug.cgi?id=30007#c8
>
An additional clarification is needed that
the LD_AUDIT stuff mentioned in that URL,
was later replaced exactly by the premap
callback (that only optionally sets the
mapping area).
  
Szabolcs Nagy April 3, 2023, 10:04 a.m. UTC | #23
The 03/31/2023 22:36, stsp wrote:
> 31.03.2023 22:12, Szabolcs Nagy пишет:
> > loading from special place and loading into special place
> > are different things.
> 
> These are related.
> File-based mmaps leave you no chance
> of preserving the destination mapping,

"preserving" the "destination mapping" (?)
a new requirement..

> > your dlmem design it passes down a user callback that runs
> > under dynamic linker locks to map segments. the interface
> > contract of this callback is not documented, so it can
> > deadlock
> 
> Good point.
> But this is a documentation issue, right?
> Not too much is needed: in the simple case
> you only do anonymous mmap() in a callback.
> In a more difficult case (my use-case) I do
> shm_open(), ftruncate() and mmap().
> Of course I need to document that any libdl
> functions are disallowed, is this correct?

that is not correct. (see below)

> >   and exposes dynamic linker internals (makes
> > implementation changes harder later).
> 
> Could you please clarify that part?
> typedef void *
> (dlmem_premap_t) (void *mappref, size_t maplength, size_t mapalign,
>               void *cookie);
> 
> This only passes things needed for an initial
> mmap(). The single mmap() that just defines
> the mapping address.
> What internals do you mean?

the segments of an elf file can be mmapped in many ways,
there is no such thing as initial mmap (there need not
be a mapping that covers all segments and there is no
ordering requirement for the mappings or which mapping
is file backed etc) your callback exposes a particular
behaviour.

and libc apis are exposed to not use locks that have
lock ordering requirement wrt the dynamic linker
internal lock. (e.g. we could add such a lock in mmap
or shm_open since they are not required to be as-safe.
so your callback can deadlock in a future glibc).

and it's not just locks but any unusual libc internal
state that becomes observable via the callback.
(the callback runs when the dynamic linker may be in
inconsistent internal state. why do you assume that
shm_open or other libc apis work in that case? this
is a new requirement that has to be documented.)

> Also please note that the callback is optional
> and is not needed for 99% of the interesting
> cases possible with dlmem().

it does not matter if it's 99% or 99.999%, what matters
is that if we expose an api then that has to work
*forever* under all potential usage and we have to
keep *supporting* it with all the api&abi constraints.

giving examples is not a *proof* that the api works.

> > if all you want is force mapping below 4G then you should
> > have asked for an RTLD_ flag (which may not be useful
> > enough or may not be supportable on all relevant targets,
> > but at least does not need significant loader changes).
> There is already such flag.

x86 specific flag.

> Unfortunately my setup is much more
> complicated than that: I need to force
> the mapping under 4Gb AND I need to
> force it into the shared buffer which I
> then mmap() to another address...

you didnt explain this in the rationale of the patches.

but it seems what you want is not suitable for libc.
you have to write your own loader to do this.
  
stsp April 3, 2023, 10:43 a.m. UTC | #24
Hi,

03.04.2023 15:04, Szabolcs Nagy пишет:
> "preserving" the "destination mapping" (?)
> a new requirement..

Indeed, which is the main point of that patch.
New requirement - new API.
Briefly, if the premap callback established some
MAP_SHARED mapping, then it will remain that
way after dlmem() finished working. But that's
an optional behavior, activated by a flag.
Normal tasks (like dlopen_with_offset()) need
nothing like that.
See the full details by an URL from my prev mail
about the use-case for such a thing.

> the segments of an elf file can be mmapped in many ways,
> there is no such thing as initial mmap (there need not
> be a mapping that covers all segments and there is no
> ordering requirement for the mappings or which mapping
> is file backed etc) your callback exposes a particular
> behaviour.

I think this is a misunderstanding.
Even before my patch there is this code in dl-load.c:

     /* Length of the sections to be loaded.  */
     maplength = loadcmds[nloadcmds - 1].allocend - loadcmds[0].mapstart;

So it calculates the maximum length to map, which
is then mapped in 1 go in _dl_map_segment().
There is also this comment (again, w/o any patches
of mine):

          So we map the first segment without MAP_FIXED, but with its
          extent increased to cover all the segments.  Then we remove
          access from excess portion, and there is known sufficient space
          there to remap from the later segments.

So I am working on an existing frame-work only.

> and libc apis are exposed to not use locks that have
> lock ordering requirement wrt the dynamic linker
> internal lock. (e.g. we could add such a lock in mmap
> or shm_open since they are not required to be as-safe.
> so your callback can deadlock in a future glibc).

Well you probably can't add such a lock into mmap(),
as the loader calls mmap() anyway. Premap callback
just does the same thing.
If you can add some lock to shm_open(), then
shm_open() can simply be done before calling dlmem().
The only thing to care here, is probably ftruncate().
Do you really think some lock can be added to
ftruncate()?

> and it's not just locks but any unusual libc internal
> state that becomes observable via the callback.
> (the callback runs when the dynamic linker may be in
> inconsistent internal state. why do you assume that
> shm_open or other libc apis work in that case? this
> is a new requirement that has to be documented.)

Mainly mmap(), and the loader does mmap()s by
himself, so I quite assume they work from the
callback.

> it does not matter if it's 99% or 99.999%, what matters
> is that if we expose an api then that has to work
> *forever* under all potential usage and we have to
> keep *supporting* it with all the api&abi constraints.

Would it be acceptable to settle on an agreement
to not add a lock to ftruncate() that would prevent
its use from a premap callback? If its not possible,
of course there is still an option to document the
premap callback in a way so it will have to do the
syscalls directly. Which doesn't look like a very
good choice, but possible.


> giving examples is not a *proof* that the api works.

The example was needed to demonstrate that
premap is not needed to map segments.


>> Unfortunately my setup is much more
>> complicated than that: I need to force
>> the mapping under 4Gb AND I need to
>> force it into the shared buffer which I
>> then mmap() to another address...
> you didnt explain this in the rationale of the patches.

I mentioned such a use-case actually in the
rationale, here's the quote:

More so, if DLMEM_DONTREPLACE flag is used, then the mapping
established by the premap callback, will not be replaced with the
file-backed mapping. In that case dlmem() have to use memcpy(), which
is likely even faster than mmaps() but doesn't end up with the proper
/proc/self/map_files or /proc/self/maps entries. So for example if the
premap callback uses MAP_SHARED, then with the use of the DLMEM_DONTREPLACE
flag you can get your solib relocated into a shared memory buffer.


Should I make is more verbose?

> but it seems what you want is not suitable for libc.
> you have to write your own loader to do this.
I very much wish to do that!
But so far I failed with a few naive attempts,
like building the patched glibc statically, or
loading custom libc.so into a separate name-space...
I don't know what interface should be exposed
by glibc to let my own loader to create a new
link-map and bind symbols to it. Maybe even
none, maybe I can do that via r_debug for example.
So I very much wish to work in that direction, if you
think it is a more appropriate solution that
can be accepted into glibc (if any changes are
at all needed).

However, dlmem() can be used for Android's
dlopen_with_offset(), which was also rejected
from glibc. With dlmem() its just a dozen of
lines that never need to be in glibc itself. So
for example the problem with dlopen_with_offset()
could be easily settled with my dlmem().
  
Szabolcs Nagy April 3, 2023, 12:01 p.m. UTC | #25
The 04/03/2023 15:43, stsp via Libc-alpha wrote:
> 03.04.2023 15:04, Szabolcs Nagy пишет:
> > the segments of an elf file can be mmapped in many ways,
> > there is no such thing as initial mmap (there need not
> > be a mapping that covers all segments and there is no
> > ordering requirement for the mappings or which mapping
> > is file backed etc) your callback exposes a particular
> > behaviour.
> 
> I think this is a misunderstanding.
> Even before my patch there is this code in dl-load.c:

these are all *internals*, there is nothing in the
public api (or even elf spec) that says this is how
this should be done.

> So I am working on an existing frame-work only.

no, you are exposing implementation internal details.

> > and libc apis are exposed to not use locks that have
> > lock ordering requirement wrt the dynamic linker
> > internal lock. (e.g. we could add such a lock in mmap
> > or shm_open since they are not required to be as-safe.
> > so your callback can deadlock in a future glibc).
> 
> Well you probably can't add such a lock into mmap(),
> as the loader calls mmap() anyway. Premap callback
> just does the same thing.

no, the loader does not call the public mmap libc api.
it calls internal __mmap that may behave differently.

the current implementation does not impose any
requirement on the public mmap symbol.

> Do you really think some lock can be added to
> ftruncate()?

we have to reason about the constraints not second
guess what others think, we have plenty syscalls
emulated in the libc for posix conformance reasons
doing non-trivial things and various libc apis are
interposed by user code (see asan) and then they can
do whatever (e.g. asan interposes mmap, but cannot
interpose the internal __mmap).

> > and it's not just locks but any unusual libc internal
> > state that becomes observable via the callback.
> > (the callback runs when the dynamic linker may be in
> > inconsistent internal state. why do you assume that
> > shm_open or other libc apis work in that case? this
> > is a new requirement that has to be documented.)
> 
> Mainly mmap(), and the loader does mmap()s by
> himself, so I quite assume they work from the
> callback.

again, this is wrong.

> > it does not matter if it's 99% or 99.999%, what matters
> > is that if we expose an api then that has to work
> > *forever* under all potential usage and we have to
> > keep *supporting* it with all the api&abi constraints.
> 
> Would it be acceptable to settle on an agreement
> to not add a lock to ftruncate() that would prevent
> its use from a premap callback? If its not possible,
> of course there is still an option to document the
> premap callback in a way so it will have to do the
> syscalls directly. Which doesn't look like a very
> good choice, but possible.

a user callback must not run in a context where libc
internals are exposed (e.g. under internal locks).
if you do that you need very strong justification
and very careful specification of the api. (moving
the callback outside the dynlinker locks or passing
down flags instead of a callback would solve this
particular issue.)

> > > Unfortunately my setup is much more
> > > complicated than that: I need to force
> > > the mapping under 4Gb AND I need to
> > > force it into the shared buffer which I
> > > then mmap() to another address...
> > you didnt explain this in the rationale of the patches.
> 
> I mentioned such a use-case actually in the
> rationale, here's the quote:
> 
> More so, if DLMEM_DONTREPLACE flag is used, then the mapping
> established by the premap callback, will not be replaced with the
> file-backed mapping. In that case dlmem() have to use memcpy(), which
> is likely even faster than mmaps() but doesn't end up with the proper
> /proc/self/map_files or /proc/self/maps entries. So for example if the
> premap callback uses MAP_SHARED, then with the use of the DLMEM_DONTREPLACE
> flag you can get your solib relocated into a shared memory buffer.

there are so many unexplained details here.. we have
to guess what you have in mind. (should tls also end
up in shared memory? what if the libc wants to generate
executable code per module for instrumentation? e.g.
for PLT hooking, should that go to the shared mapping
too? what if the malloc implementation reuses bits of
memory from shared libs they waste due to page alignment
is that legal? we can answer these questions if you
either specify the interface contracts or the point of
all of this, but right now it's anybody's guess.)

libc will likely not include an api to load libs into
shared memory. (e.g. aarch64 has features that work on
anon or file backed memory but not on shared memory and
some systems may prevent it for security reasons, so
this has at least portability problems, but it also
clearly constrains the implementation.)

> > but it seems what you want is not suitable for libc.
> > you have to write your own loader to do this.
> I very much wish to do that!
> But so far I failed with a few naive attempts,
> like building the patched glibc statically, or
> loading custom libc.so into a separate name-space...

you can load a shared lib / generate code that does
not use normal libc shared library abi, but can
call into libc (or some other library) or can be
called from normal libraries. this is kind of what
jits do.

> I don't know what interface should be exposed
> by glibc to let my own loader to create a new
> link-map and bind symbols to it. Maybe even
> none, maybe I can do that via r_debug for example.
> So I very much wish to work in that direction, if you
> think it is a more appropriate solution that
> can be accepted into glibc (if any changes are
> at all needed).
> 
> However, dlmem() can be used for Android's
> dlopen_with_offset(), which was also rejected
> from glibc. With dlmem() its just a dozen of
> lines that never need to be in glibc itself. So
> for example the problem with dlopen_with_offset()
> could be easily settled with my dlmem().

there is probably a reason why that api got rejected
and likely the same reason applies to dlmem.
  
stsp April 3, 2023, 1:07 p.m. UTC | #26
Hello,

03.04.2023 17:01, Szabolcs Nagy пишет:
> these are all *internals*, there is nothing in the
> public api (or even elf spec) that says this is how
> this should be done.

Yes, OTOH of course its reasonable to expect
that the solib's mapping size's upper limit can
be estimated before doing the actual mapping.
So I rely on such estimation code which is already
there.

Maybe it would be a solution to add another
function that will return such a size estimation?
Well, I am sure this also have dozens of down-sides...

>>> and libc apis are exposed to not use locks that have
>>> lock ordering requirement wrt the dynamic linker
>>> internal lock. (e.g. we could add such a lock in mmap
>>> or shm_open since they are not required to be as-safe.
>>> so your callback can deadlock in a future glibc).
>> Well you probably can't add such a lock into mmap(),
>> as the loader calls mmap() anyway. Premap callback
>> just does the same thing.
> no, the loader does not call the public mmap libc api.
> it calls internal __mmap that may behave differently.
>
> the current implementation does not impose any
> requirement on the public mmap symbol.

An option may be to pass a premap callback
with a pointers to such __mmap, __ftruncate. But
of course not the best option again. :(

> a user callback must not run in a context where libc
> internals are exposed (e.g. under internal locks).
> if you do that you need very strong justification
> and very careful specification of the api. (moving
> the callback outside the dynlinker locks or passing
> down flags instead of a callback would solve this
> particular issue.)

I considered passing flags and fd.
This makes things much less flexible, but
the real problem here is to set the relocation
address. premap callback just returns it
after doing the mapping of a requested size.
Without knowing the estimated size beforehand,
you unfortunately can't easily go that route. :(
The func to get the estimated size, could help.


> there are so many unexplained details here.. we have
> to guess what you have in mind. (should tls also end
> up in shared memory? what if the libc wants to generate
> executable code per module for instrumentation? e.g.
> for PLT hooking, should that go to the shared mapping
> too? what if the malloc implementation reuses bits of
> memory from shared libs they waste due to page alignment
> is that legal? we can answer these questions if you
> either specify the interface contracts or the point of
> all of this, but right now it's anybody's guess.)

No, no, its much simpler! :)
I just put _entire_ solib into a shared mapping.
I do not care about tls or anything.
All I do is create the initial large mapping, that
remains unaffected by the loader. Even the
inter-segment gaps you mentioned, or the
anonymously mapped parts like .bss - just all
go into that initial shared mapping. Which is
why I haven't described such details. That are
simply not important in that case.

> libc will likely not include an api to load libs into
> shared memory. (e.g. aarch64 has features that work on
> anon or file backed memory but not on shared memory and
> some systems may prevent it for security reasons, so
> this has at least portability problems, but it also
> clearly constrains the implementation.)

Well, from the API stand-point its not mapping
into a shared memory. Its mapping into where's
the premap callback decided. Can we assume
that the premap callback is responsible to not
do stupid things on aarch64, where mapping
to shared memory will not work as expected?


> you can load a shared lib / generate code that does
> not use normal libc shared library abi, but can
> call into libc (or some other library) or can be
> called from normal libraries. this is kind of what
> jits do.

I wanted to load a solib and get its DT_NEEDED
deps all resolved. If I had some way of linking
things with my patched, statically built libc, that
would work. Maybe I can rip off the glibc loader
and somehow build it separately, etc. So what
you say, may work, but seems like a tremendous
task of either writing a full loader or ripping off
the one from glibc...
And even that route will only allow me to use
such libs in a separate NS, as in base NS it
won't have a link-map visible by glibc. But
having that in a separate NS is acceptable for
my needs, so eventually I may go that route
if all else fails.


> there is probably a reason why that api got rejected
> and likely the same reason applies to dlmem.
Likely, but I just thought having an android
extensions in, is not so bad for glibc (of course
its not me to decide). So having a building block
that supports dlopen_with_offset() being externally
implemented, looked like a tasty thing for glibc,
but perhaps it wasn't the case.
  
stsp April 5, 2023, 7:29 a.m. UTC | #27
Hi Szabolcs,

03.04.2023 17:01, Szabolcs Nagy пишет:
> a user callback must not run in a context where libc
> internals are exposed (e.g. under internal locks).
I checked that part a bit more, and so far
my findings are (let me know if they are wrong):
- That code indeed works under dl_load_lock,
   but it calls audit callbacks rather freely, and
   I don't see it releasing the lock before calling
   the user audit callbacks.

- dl_iterate_phdr() seems to be calling the user
   callback under dl_load_write_lock lock.

So my callback is called in a same context
where audit callbacks are called, and also
other callbacks seem to have no problem
being called under various locks.

So in that sense why my callback is different,
or what am I missing?
  
Szabolcs Nagy April 5, 2023, 8:51 a.m. UTC | #28
The 04/05/2023 12:29, stsp wrote:
> Hi Szabolcs,
> 
> 03.04.2023 17:01, Szabolcs Nagy пишет:
> > a user callback must not run in a context where libc
> > internals are exposed (e.g. under internal locks).
> I checked that part a bit more, and so far
> my findings are (let me know if they are wrong):
> - That code indeed works under dl_load_lock,
>   but it calls audit callbacks rather freely, and
>   I don't see it releasing the lock before calling
>   the user audit callbacks.
> 
> - dl_iterate_phdr() seems to be calling the user
>   callback under dl_load_write_lock lock.

this is a known bug.

(audit is special because it is not an api for applications
but for tooling that hooks into dynamic linker internals,
it has its own abi versioning so we can break the abi more
easily and those tools are expected to cope with it. it is
necessarily exposing internals, so new hooks are more likely
acceptable there, but that's not true for apis designed for
application use.)

a much bigger issue is calling user ctors under the
dynamic linker lock which did cause issues in practice.

another issue is calling malloc under locks when malloc
can be interposed (likewise it caused issues in practice).

i did not say glibc was clean from these problems, but
that does not mean we want to add more to it.

> 
> So my callback is called in a same context
> where audit callbacks are called, and also
> other callbacks seem to have no problem
> being called under various locks.

they have problems.

in any case dlmem has way too many issues beyond
the locking one, the fact that you don't document
nor understand the issues around the huge interface
boundary it creates does not help either.

to successfully propose a new api you need to use
the same rigor as posix or c standardization would.
  
stsp April 5, 2023, 9:26 a.m. UTC | #29
05.04.2023 13:51, Szabolcs Nagy пишет:
> The 04/05/2023 12:29, stsp wrote:
>> Hi Szabolcs,
>>
>> 03.04.2023 17:01, Szabolcs Nagy пишет:
>>> a user callback must not run in a context where libc
>>> internals are exposed (e.g. under internal locks).
>> I checked that part a bit more, and so far
>> my findings are (let me know if they are wrong):
>> - That code indeed works under dl_load_lock,
>>    but it calls audit callbacks rather freely, and
>>    I don't see it releasing the lock before calling
>>    the user audit callbacks.
>>
>> - dl_iterate_phdr() seems to be calling the user
>>    callback under dl_load_write_lock lock.
> this is a known bug.
>
> (audit is special because it is not an api for applications
> but for tooling that hooks into dynamic linker internals,
I had the version of the patch that
added an audit callback instead of
the direct callback.
Do you think that's going to be better?
In that case the dlmem() API is only
documented for the basic tasks of loading
a solib from memory.
Audit module can then add an advanced
functionality, but its not going to be a
part of glibc.
The problems around documenting maplength
are not available in such an impl, since
that became internal to an audit callback.


> in any case dlmem has way too many issues beyond
> the locking one, the fact that you don't document
> nor understand the issues around the huge interface
> boundary it creates does not help either.
I understand the problem around maplength,
and I wonder if getting back to an audit
callback may be a solution to this and all
the other problems you mentioned.
Essentially in this case glibc has only a very
small and simple call, and the rest is implemented
by an author of an audit module.
  
Florian Weimer April 5, 2023, 9:31 a.m. UTC | #30
* Szabolcs Nagy via Libc-alpha:

> The 04/05/2023 12:29, stsp wrote:
>> - dl_iterate_phdr() seems to be calling the user
>>   callback under dl_load_write_lock lock.
>
> this is a known bug.

It's also not something we can fix because the libgcc unwinder has code
on it that relies on this implicit loader lock to protect its internal
data structures.  The libgcc unwinder can be statically linked, so we
can't remove the locking without adding a new symbol version.

I suspect other uses of dl_iterate_phdr are similar.

Thanks,
Florian
  
stsp April 12, 2023, 5:23 p.m. UTC | #31
Hello,

05.04.2023 13:51, Szabolcs Nagy пишет:
> this is a known bug.
>
> (audit is special because it is not an api for applications
> but for tooling that hooks into dynamic linker internals,
> it has its own abi versioning so we can break the abi more
> easily and those tools are expected to cope with it. it is
> necessarily exposing internals, so new hooks are more likely
> acceptable there, but that's not true for apis designed for
> application use.)
>
> a much bigger issue is calling user ctors under the
> dynamic linker lock which did cause issues in practice.
>
> another issue is calling malloc under locks when malloc
> can be interposed (likewise it caused issues in practice).
>
> i did not say glibc was clean from these problems, but
> that does not mean we want to add more to it.
OK, in an attempt to address these concerns
I am currently trying to implement an entirely
different approach.
The basic idea is that dlmem() can leave an
object unrelocated. This is a small change,
relocation is currently done in dl_open_worker_begin().
But I can do it upon a subsequent call of an
any dlfcn function with the handle of that object,
by just checking the l->relocated flag that is
already present in the link_map struct.
The same "lazy relocation" can even be
applied to a regular dlopen().

Now the trick is that when you have an
unrelocated object, you can move it by
hands to the desired address (preserving
the page protection, that's easy) and call
dlsetbase(handle, address);
to update a couple of link-map pointers.
dlsetbase() would fail if the object is
already relocated, so it should be called
right after dlmem() or dlopen(), or not called
at all if you don't need to alter the solib
location.
To get a mapaddr and maplength of a loaded
solib I am going to add a trivial getters for the
needed values from a link-map. These are
only needed if one wants to move the loaded
unrelocated solib.

That still fits my use-case very well, doesn't
require a callback and doesn't require the user
to know the maplength beforehand.
So it would seem to address all of your concerns,
and overall seems a much simpler approach.
But it would require an additional dlsetaddr()
trivial function and a few link-map getters,
besides dlmem() itself.

Would this approach be viable?
  
stsp April 12, 2023, 6 p.m. UTC | #32
12.04.2023 22:23, stsp пишет:
> The same "lazy relocation" can even be
> applied to a regular dlopen().
Of course it can't, as dlopen() is expected
to call library ctors. So only with an optional
dlmem() flag such behavior should be performed,
in which case dlsetbase() should also do the
final relocation and ctors calling.
  
Rich Felker April 12, 2023, 6:20 p.m. UTC | #33
On Wed, Apr 12, 2023 at 11:00:19PM +0500, stsp wrote:
> 
> 12.04.2023 22:23, stsp пишет:
> >The same "lazy relocation" can even be
> >applied to a regular dlopen().
> Of course it can't, as dlopen() is expected
> to call library ctors. So only with an optional
> dlmem() flag such behavior should be performed,
> in which case dlsetbase() should also do the
> final relocation and ctors calling.

Have you not realized yet how far outside any concept of "reasonable"
this ever-expanding contract is?

Whatever happens, this whole proposal is going to be rejected, over
and over, and the more effort you pour into engineering it on the
belief that it will somehow eventually be accepted, the more
disappointed (and maybe angry?) you're going to be at the people who
have to keep rejecting it. For the sake of your time and that of the
folks who have been pulled into this discussion, I think it would
really, *really* be a good idea to go figure out how to solve your
practical problem without trying to put new functionality in libc.

Rich
  
stsp April 12, 2023, 6:46 p.m. UTC | #34
12.04.2023 23:20, Rich Felker пишет:
> On Wed, Apr 12, 2023 at 11:00:19PM +0500, stsp wrote:
>> 12.04.2023 22:23, stsp пишет:
>>> The same "lazy relocation" can even be
>>> applied to a regular dlopen().
>> Of course it can't, as dlopen() is expected
>> to call library ctors. So only with an optional
>> dlmem() flag such behavior should be performed,
>> in which case dlsetbase() should also do the
>> final relocation and ctors calling.
> Have you not realized yet how far outside any concept of "reasonable"
> this ever-expanding contract is?

How can I realize that, if it was never
explained? In fact, the only API review
was from Szabolcs, and he just pointed
to 2 problems that are seemingly solvable.

> Whatever happens, this whole proposal is going to be rejected, over
> and over, and the more effort you pour into engineering it on the
> belief that it will somehow eventually be accepted, the more
> disappointed (and maybe angry?) you're going to be at the people who
> have to keep rejecting it.

If no one spells out the proper reason,
then maybe.
Is it that difficult to spell out the reason
of a rejection?


>   For the sake of your time and that of the
> folks who have been pulled into this discussion, I think it would
> really, *really* be a good idea to go figure out how to solve your
> practical problem without trying to put new functionality in libc.
In fact, instead of spelling out the rejection
reason, all discussions were revolving around
solving my problem w/o modifying libc.
Basically there are currently 2 variations of
such solutions:
1. Intercept glibc's loader mmap()s on a syscall
level and bind a non-trivial logic to them (Jonathon's
proposal)
2. Implement the dynamic loader outside of glibc
(Szabolcs's proposal, and also my own wish).

I'd be happy to implement 2, but glibc doesn't
have the hooks needed for external dynamic
loader, and it doesn't even allow to use the
statically-linked patched glibc as such a loader
in a separate namespace.
Which is why so far I haven't seen a viable
proposal on solving my problem, and haven't
seen a clear explanation why the change to
glibc is going to be rejected even if the raised
concerns are all addressed.
  
stsp April 12, 2023, 7:07 p.m. UTC | #35
12.04.2023 23:20, Rich Felker пишет:
> On Wed, Apr 12, 2023 at 11:00:19PM +0500, stsp wrote:
>> 12.04.2023 22:23, stsp пишет:
>>> The same "lazy relocation" can even be
>>> applied to a regular dlopen().
>> Of course it can't, as dlopen() is expected
>> to call library ctors. So only with an optional
>> dlmem() flag such behavior should be performed,
>> in which case dlsetbase() should also do the
>> final relocation and ctors calling.
> Have you not realized yet how far outside any concept of "reasonable"
> this ever-expanding contract is?
Would it be more consistent to always
require a pair of dlmem()/dlinitialize(handle, addr) calls?
Or the 2-step loading is out of any considerations?
  
Zack Weinberg April 12, 2023, 7:52 p.m. UTC | #36
On Wed, Apr 12, 2023, at 2:46 PM, stsp via Libc-alpha wrote:
> 12.04.2023 23:20, Rich Felker пишет:
>> On Wed, Apr 12, 2023 at 11:00:19PM +0500, stsp wrote:
>>> 12.04.2023 22:23, stsp пишет:
>>>> The same "lazy relocation" can even be applied to a regular
>>>> dlopen().
>>> Of course it can't, as dlopen() is expected to call library ctors.
>>> So only with an optional dlmem() flag such behavior should be
>>> performed, in which case dlsetbase() should also do the final
>>> relocation and ctors calling.
>> Have you not realized yet how far outside any concept of "reasonable"
>> this ever-expanding contract is?
>
> How can I realize that, if it was never explained? In fact, the only
> API review was from Szabolcs, and he just pointed to 2 problems that
> are seemingly solvable.

This discussion keeps going in circles.  You, Stas, think you aren't
getting any explanation for why your API proposals are being rejected.
Meanwhile, Szabolcs, Adhemerval, Carlos, Rich, etc. all think they ARE
giving you explanations for why they don't like this API.

Let me try to highlight specific issues from the feedback you've been
getting.  It appears to me that your *API* proposal -- not any details
of its implementation, the API itself -- is unacceptable *BECAUSE*:

* It introduces a callback from the dynamic loader into user code, that
  is to be run with an internal lock held. glibc does already have
  callbacks like that, but we do not want to introduce any more of
  them, period.
* It assumes that the dynamic loader will always, in the future, operate
  by creating a large initial mapping and then modifying protection
  flags on it.
* It assumes that it the dynamic loader is, and will always be, capable
  of taking a pointer to a block of memory -- no matter how it was
  originally allocated -- and then chopping its virtual memory map up to
  satisfy the requirements of the embedded ELF program header table,
  without breaking anything else.  [On this specific note, I foresee
  someone trying to write a dynamic module into memory allocated with
  memalign() or even plain old malloc(); calling dlmem() on this is very
  likely to apply PROT_NONE to address ranges containing malloc
  metadata!]

Carlos also gave you the specific feedback that he thinks an API
structured around handing the dynamic loader a file descriptor, not a
pointer, would be much more likely to be accepted.  Your response to
this was "dlmem() can optionally preserve the destination mapping, which
is not possible by definition with any file-based API."  I accept that
_you_ think that, but I don't think it's true, myself.  If your goal is
to force a shared library to be loaded into memory with specific
characteristics, I can think of at least two designs that ought to be
able to achieve that goal while still starting from the same place as
fdlopen().  For instance, maybe we could have an fdlopen variant that
called callback functions instead of mmap() and mprotect().  Unlike
your dlmem_premap_t, I think it ought to be possible to avoid doing
those specific operations with any locks held.

zw
  
stsp April 13, 2023, 10:01 a.m. UTC | #37
12.04.2023 23:20, Rich Felker пишет:
> On Wed, Apr 12, 2023 at 11:00:19PM +0500, stsp wrote:
>> 12.04.2023 22:23, stsp пишет:
>>> The same "lazy relocation" can even be
>>> applied to a regular dlopen().
>> Of course it can't, as dlopen() is expected
>> to call library ctors. So only with an optional
>> dlmem() flag such behavior should be performed,
>> in which case dlsetbase() should also do the
>> final relocation and ctors calling.
> Have you not realized yet how far outside any concept of "reasonable"
> this ever-expanding contract is?

OK, consider this small proposal (not dlmem):

RTLD_NORELOC - new flag to skip the relocation step.

dlrelocate(handle, base) - new call to explicitly
perform the relocation (and call ctors) that was
skipped with RTLD_NORELOC flag.

That alone already solves most of what I wanted
to achieve with dlmem(). Namely, it allows the
user to specify the relocation address he needs.
He does so by manually moving the solib image
to the desired location, and then calling dlrelocate()
with new base address.

Could you please tell me beforehand why that
proposal will be rejected?


> Whatever happens, this whole proposal is going to be rejected, over
> and over,

I believe you, but the question is still "why".
In all the other projects I contributed in, the
proposals were in most cases getting the
change requests. And in glibc, the only available
change request is to drop the proposal? :)


>   and the more effort you pour into engineering it on the
> belief that it will somehow eventually be accepted, the more
> disappointed (and maybe angry?) you're going to be at the people who
> have to keep rejecting it.
This is a wrong assumption, because in a
worst case this code will make it into a standalone
loader. So far I know how to do that only until
glibc changes one of its core structs, like rtld_global_ro
or link_map. With every such change, the loader
will stop working. So that would be a very unfortunate
and unreliable solution. I'd like to avoid it, but
there is no risk that my efforts working on that
code will be wasted.
  
Szabolcs Nagy April 13, 2023, 12:38 p.m. UTC | #38
The 04/13/2023 15:01, stsp wrote:
> 12.04.2023 23:20, Rich Felker пишет:
> > On Wed, Apr 12, 2023 at 11:00:19PM +0500, stsp wrote:
> > > 12.04.2023 22:23, stsp пишет:
> > > > The same "lazy relocation" can even be
> > > > applied to a regular dlopen().
> > > Of course it can't, as dlopen() is expected
> > > to call library ctors. So only with an optional
> > > dlmem() flag such behavior should be performed,
> > > in which case dlsetbase() should also do the
> > > final relocation and ctors calling.
> > Have you not realized yet how far outside any concept of "reasonable"
> > this ever-expanding contract is?
> 
> OK, consider this small proposal (not dlmem):
> 
> RTLD_NORELOC - new flag to skip the relocation step.
> 
> dlrelocate(handle, base) - new call to explicitly
> perform the relocation (and call ctors) that was
> skipped with RTLD_NORELOC flag.
> 
> That alone already solves most of what I wanted
> to achieve with dlmem(). Namely, it allows the
> user to specify the relocation address he needs.
> He does so by manually moving the solib image
> to the desired location, and then calling dlrelocate()
> with new base address.
> 
> Could you please tell me beforehand why that
> proposal will be rejected?

it does not allow handling relocation failures sensibly,
does not specify what happens with dependencies and
exposes half loaded modules to users with unclear use-case
or semantics for them.

> > Whatever happens, this whole proposal is going to be rejected, over
> > and over,
> 
> I believe you, but the question is still "why".
> In all the other projects I contributed in, the
> proposals were in most cases getting the
> change requests. And in glibc, the only available
> change request is to drop the proposal? :)

if you fix an actual bug that's accepted.

you are adding a new interface without gathering consensus
that it's a good idea at all or how it should work.

> >   and the more effort you pour into engineering it on the
> > belief that it will somehow eventually be accepted, the more
> > disappointed (and maybe angry?) you're going to be at the people who
> > have to keep rejecting it.
> This is a wrong assumption, because in a
> worst case this code will make it into a standalone
> loader. So far I know how to do that only until
> glibc changes one of its core structs, like rtld_global_ro
> or link_map. With every such change, the loader
> will stop working. So that would be a very unfortunate
> and unreliable solution. I'd like to avoid it, but
> there is no risk that my efforts working on that
> code will be wasted.

when i said you should write your own loader i expected
you to create simple stubs in memory at the location
you want them to be and bind them to a library that
is loaded by libc (so you can connect your weird world
with the libc world).

it looked like a simple solution to me, but i still dont
know exactly your requirements so in a sense we still
don't have a well understood use-case for any of your
proposed changes. so maye the use-case is where you
should start.
  
stsp April 13, 2023, 3:59 p.m. UTC | #39
Hi,

13.04.2023 17:38, Szabolcs Nagy пишет:
> it does not allow handling relocation failures sensibly,
> does not specify what happens with dependencies and
> exposes half loaded modules to users with unclear use-case
> or semantics for them.

Thanks, lets update the spec to address this:

RTLD_NORELOC - do not relocate the object, allowing
to perform the relocation later. Ctors are delayed, and
are called immediately after the relocation is done.
Relocation is performed upon the first operation on
the returned handle, before the requested operation
on a handle is performed, with the exception of
dlset_object_baseaddr() which doesn't perform the
relocation. If this flag is used, it is recommended to
use dlrelocate() to explicitly perform the relocation
and check the result of it. This flag doesn't delay
the load of an object deps and the calls to their ctors.
It also doesn't delay the LA_ACT_CONSISTENT
audit event.

dlset_object_baseaddr(handle, addr) - sets the base
address of an unrelocated object. Returns error if
the object is already relocated. The base address
set by this function, will be used when relocation
is performed.

dlrelocate(handle) - perform the object relocation
if the RTLD_NORELOC flag was used, return EINVAL
if already relocated. This function may be omitted
even if RTLD_NORELOC was used, in which case
the relocation will be performed upon the first
operation on a handle, but using that function
allows to handle relocation errors and run ctors
before using the object's handle. If the function
returned success then ctors of an object were
called by it. If it returned error other than EINVAL
(EINVAL means object already relocated), then
relocation error happened and the handle should
be closed with dlclose().


It seems that spec covers all of your concerns.
It does cover even the exposure of "half-loaded
modules" by stating that the relocation may
happen lazily so things would work as expected
in any case.

> if you fix an actual bug that's accepted.

I believe this classifies as a fix:
https://patchwork.ozlabs.org/project/glibc/patch/20230301150818.31815-2-stsp2@yandex.ru/

> you are adding a new interface without gathering consensus
> that it's a good idea at all or how it should work.

I can get such a consensus only if people,
like you do (many thanks for that!), explain
the problems to me and suggest an amendments.
Unfortunately, its not the case. People are
saying that my proposal is bad, and the only
suggested amendment is to drop it immediately. :)

But I am very thankful to those who at least did
some side-activities for me, like discussing my
use-case and reviewing part of the patches.
These activities were very valuable, but the real
problems were never discussed before, proposed
API was not reviewed, and so on.

And with your input I already advanced to the
point where the API can be split into 2 parts,
whereas before I thought of dlmem() as a
thing in itself that can't be split. Now I see
that RTLD_NORELOC can solve the biggest part
of the puzzle alone.


> when i said you should write your own loader i expected
> you to create simple stubs in memory at the location
> you want them to be and bind them to a library that
> is loaded by libc (so you can connect your weird world
> with the libc world).
>
> it looked like a simple solution to me, but i still dont
> know exactly your requirements so in a sense we still
> don't have a well understood use-case for any of your
> proposed changes. so maye the use-case is where you
> should start.
You mean "restart". :)
As I already explained it very thoroughly in the
bugzilla, but you missed that and now its better
to repeat, since the bugzilla entry had a very
lengthy discussions afterwards.

The use-case is a VM with 32bit code inside.
It has a 4Gb window mapped somewhere in
a 64bit space (call it VM_window_start).
The task is very simple: first I find the suitable
mapping address within a VM address space.
Such address is within the 0-4Gb boundaries.
I relocate the solib to that "low" address.
Unfortunately this is still not visible to the VM,
because VM's window is at VM_window_start,
as we remember. So in order for VM-running
32bit code to access the solib's data (like .bss,
.rodata) when it gets a 32bit pointer to such
data, I also need to map the full duplicate of
the solib object into the address that looks like:

dupe_addr = VM_window_start + reloc_addr

For that I invented the optional dlmem flag
that doesn't destroy the destination mapping.
But such flag doesn't survive your requirement
to get rid of a call-back, so in the next prop
I'll pass a shm fd to dlmem instead, and an
optional flag will ask dlmem() to mmap that
fd as a destination backing-store.

So I am sure all problems can be solved!
But of course I can't do that alone, if everyone
just keeps saying "drop your patches immediately".
  
Adhemerval Zanella Netto April 13, 2023, 6:09 p.m. UTC | #40
On 13/04/23 12:59, stsp wrote:
> 
>> you are adding a new interface without gathering consensus
>> that it's a good idea at all or how it should work.
> 
> I can get such a consensus only if people,
> like you do (many thanks for that!), explain
> the problems to me and suggest an amendments.
> Unfortunately, its not the case. People are
> saying that my proposal is bad, and the only
> suggested amendment is to drop it immediately. :)
> 
> But I am very thankful to those who at least did
> some side-activities for me, like discussing my
> use-case and reviewing part of the patches.
> These activities were very valuable, but the real
> problems were never discussed before, proposed
> API was not reviewed, and so on.

It is being tiring to work with your proposal because Rich already
brought the inherent issue of exposing the loader internals for ELF
objects [1] about 10 years ago, Szabolcs and myself are constantly
trying to say that is not good libc interface (regardless it solved
your specific problem), and even Jonathan tried to explore different 
alternatives.

The main problem is no one from glibc is comfortable, including myself,
to maintain and work with this interface. I don't want to maintain
another interface with a lot of corner cases.

Also the way approaching the glibc community, by not listening to 
criticism and ignoring multiple remarks; saying that you have addresses
the comments without any ack; or flooding the maillist with multiple 
version without addressing previous feedback is really, and I want to 
stress the really here, tiring.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=11767

> 
> And with your input I already advanced to the
> point where the API can be split into 2 parts,
> whereas before I thought of dlmem() as a
> thing in itself that can't be split. Now I see
> that RTLD_NORELOC can solve the biggest part
> of the puzzle alone.

Sigh, I really would like to say this is less rude manner but I will
be succinct so you and others can avoid waste time in this manner.
The "dlopen of in-memory ET_DYN or ET_EXEC object" is a *NACK* from
me, even if is split in multiple APIs.  Myself, Szabolsh, and Rich
already explained, but you seemed resolute to not accept it.

A fdlopen or even a dlopen_with_offset could make sense, FreeBSD
at least has fdlopen so it is a precedent of a realworld case.

> 
> 
>> when i said you should write your own loader i expected
>> you to create simple stubs in memory at the location
>> you want them to be and bind them to a library that
>> is loaded by libc (so you can connect your weird world
>> with the libc world).
>>
>> it looked like a simple solution to me, but i still dont
>> know exactly your requirements so in a sense we still
>> don't have a well understood use-case for any of your
>> proposed changes. so maye the use-case is where you
>> should start.
> You mean "restart". :)
> As I already explained it very thoroughly in the
> bugzilla, but you missed that and now its better
> to repeat, since the bugzilla entry had a very
> lengthy discussions afterwards.
> 
> The use-case is a VM with 32bit code inside.
> It has a 4Gb window mapped somewhere in
> a 64bit space (call it VM_window_start).
> The task is very simple: first I find the suitable
> mapping address within a VM address space.
> Such address is within the 0-4Gb boundaries.
> I relocate the solib to that "low" address.
> Unfortunately this is still not visible to the VM,
> because VM's window is at VM_window_start,
> as we remember. So in order for VM-running
> 32bit code to access the solib's data (like .bss,
> .rodata) when it gets a 32bit pointer to such
> data, I also need to map the full duplicate of
> the solib object into the address that looks like:
> 
> dupe_addr = VM_window_start + reloc_addr
> 
> For that I invented the optional dlmem flag
> that doesn't destroy the destination mapping.
> But such flag doesn't survive your requirement
> to get rid of a call-back, so in the next prop
> I'll pass a shm fd to dlmem instead, and an
> optional flag will ask dlmem() to mmap that
> fd as a destination backing-store.
> 
> So I am sure all problems can be solved!
> But of course I can't do that alone, if everyone
> just keeps saying "drop your patches immediately".
  
stsp April 13, 2023, 6:59 p.m. UTC | #41
Hi Adhemerval,

13.04.2023 23:09, Adhemerval Zanella Netto пишет:
> It is being tiring to work with your proposal because Rich already
> brought the inherent issue of exposing the loader internals for ELF
> objects [1] about 10 years ago,

My proposal did not exist 10 years ago.
Maybe we all do not properly document
our proposals or an objections. So let me
ask you to please refer to a particular
comment rather than to some 10 years
ago thread, and then carefully document
how can that be the problem in my patches.

I think its a fair ask, e.g. Szabolcs asked
me for a more precise spec - I write. We
progress. Why can't you write the objections
in a detalization level that is enough to
make a progress?

I admit you probably couldn't do that initially,
because I poorly documented my API. But
when Carlos asked for a more detailed spec,
I did. Now you can express your objections
in a detailed manner, let me even attach
the prev API draft to help you doing that.
Of course this draft will be simplified a lot,
but for such a "generic" objections please
use its current form.

>   Szabolcs and myself are constantly
> trying to say that is not good libc interface (regardless it solved
> your specific problem),

And regardless that it was never even discussed.
Part of the failure is on me: I wrote an API draft
only when Carlos asked, I should probably do that
before anything else. Sorry, I am a novice. But its
written now for quite long, so why nothing changes?


>   and even Jonathan tried to explore different
> alternatives.

I am thankful to Jonathon for doing that.
Unfortunately I am unsatisfied with the proposal
of intercepting glibc's internal mmap()s on a
syscall level, and binding a non-trivial (I mean,
very non-trivial!) logic on them. But I am thankful
he proved this is even possible, as I was confident
in an otherwise.

> The main problem is no one from glibc is comfortable, including myself,
> to maintain and work with this interface.

Which no one, besides Szabolcs, have actually
even looked into? It is attached to this e-mail.
Why not to make the complains direct and precise?


>   I don't want to maintain
> another interface with a lot of corner cases.

I have expressed the plans at removing all the
corner cases that Szabolcs pointed to. If you
point more, I'll definitely take an actions.
Off-loading the biggest part to RTLD_NORELOC
will reduce the proposal considerably, avoid
the callback and most of other complexity, so
why such a prejudice? Why can't we just discuss,
amend, see what happens?

> Also the way approaching the glibc community, by not listening to
> criticism and ignoring multiple remarks;

I wonder if Szabolcs also thinks I ignore his
remarks. If you think I ignore yours, then either
that was a misunderstanding on my part, or
they were not referring to my API proposal
(which is attached), or something else from that
line, but definitely not my reluctance to change
the proposal. I want to address every possible
problem raised, but unfortunately so far I only
see the problems raised by Szabolcs.
He details them well and targets over my API
proposals. Its immediately obvious what does
he mean, and after his last review I spent a
full month (!) just thinking how to get rid of
a call-back, and finally got to this RTLD_NORELOC
proposal as a result.


>   saying that you have addresses
> the comments without any ack;

That's a bit strange. Yes, I do some actions
based on a comments I get, then post the
new RFC and write in a log that I addressed
this and that. If there is a further complain,
i can re-do what I thought to be addressed,
and replace it with something I believe now
addresses the comment. Its an iterative
process, misunderstanding happens, etc.
Of course if the policy is to not write "addressed
this and that comment" before getting an
ACK, then I will no longer do that, and write
"tried to address this and that, hope its now better".


>   or flooding the maillist with multiple
> version without addressing previous feedback is really, and I want to
> stress the really here, tiring.

Ok, should I post the next drafts to the bugzilla
then, if flooding ML is bad?
Its not a problem, really!
You tell me what to do, I do.
Why such a policy things are even becoming
a problems? I didn't mean to offend anyone
by flooding the ML, I'll post to bugzilla, and I
don't want you to be offended on me just because
of that. :)
  
Adhemerval Zanella Netto April 13, 2023, 7:12 p.m. UTC | #42
On 13/04/23 15:59, stsp wrote:
> Hi Adhemerval,
> 
> 13.04.2023 23:09, Adhemerval Zanella Netto пишет:
>> It is being tiring to work with your proposal because Rich already
>> brought the inherent issue of exposing the loader internals for ELF
>> objects [1] about 10 years ago,
> 
> My proposal did not exist 10 years ago.
> Maybe we all do not properly document
> our proposals or an objections. So let me
> ask you to please refer to a particular
> comment rather than to some 10 years
> ago thread, and then carefully document
> how can that be the problem in my patches.
> 
> I think its a fair ask, e.g. Szabolcs asked
> me for a more precise spec - I write. We
> progress. Why can't you write the objections
> in a detalization level that is enough to
> make a progress?

Sigh, we are going into circles again.

> 
> I admit you probably couldn't do that initially,
> because I poorly documented my API. But
> when Carlos asked for a more detailed spec,
> I did. Now you can express your objections
> in a detailed manner, let me even attach
> the prev API draft to help you doing that.
> Of course this draft will be simplified a lot,
> but for such a "generic" objections please
> use its current form.

The problem is not a poor documented ABI, the ABI itself has inherent
corner cases brought multiple times.  The problem you are trying to
solve would be better served with a custom loader.

> 
> I have expressed the plans at removing all the
> corner cases that Szabolcs pointed to. If you
> point more, I'll definitely take an actions.
> Off-loading the biggest part to RTLD_NORELOC
> will reduce the proposal considerably, avoid
> the callback and most of other complexity, so
> why such a prejudice? Why can't we just discuss,
> amend, see what happens?

The RTLD_NORELOC itself is not a better approach, is just moving the
same corner cases that we brought to a different symbol... against
walking in cycles.

> 
> Ok, should I post the next drafts to the bugzilla
> then, if flooding ML is bad?
> Its not a problem, really!
> You tell me what to do, I do.
> Why such a policy things are even becoming
> a problems? I didn't mean to offend anyone
> by flooding the ML, I'll post to bugzilla, and I
> don't want you to be offended on me just because
> of that. :)

Do not bother, I think the best course of action is just to drop
the RFE.
  
stsp April 13, 2023, 7:29 p.m. UTC | #43
Adhemerval,

14.04.2023 00:12, Adhemerval Zanella Netto пишет:
>> I admit you probably couldn't do that initially,
>> because I poorly documented my API. But
>> when Carlos asked for a more detailed spec,
>> I did. Now you can express your objections
>> in a detailed manner, let me even attach
>> the prev API draft to help you doing that.
>> Of course this draft will be simplified a lot,
>> but for such a "generic" objections please
>> use its current form.
> The problem is not a poor documented ABI, the ABI itself has inherent
> corner cases brought multiple times.

By whom?
Where?
You referred to some 10 years old thread,
where I didn't even participated. Is it too
much to ask to at least point me to a particular
comment in that thread?


>    The problem you are trying to
> solve would be better served with a custom loader.

I wish to have a custom loader.
But so far I don't know how to make it
friendly to glibc. Custom loader needs
to be able to create a link-map, and glibc
currently doesn't have a hook to do that.
I currently only know how to create a
custom loader that works until glibc
changes either struct link_map or struct
rtld_global_ro. Such custom loader can
be done, but it won't be very reliable.

>> I have expressed the plans at removing all the
>> corner cases that Szabolcs pointed to. If you
>> point more, I'll definitely take an actions.
>> Off-loading the biggest part to RTLD_NORELOC
>> will reduce the proposal considerably, avoid
>> the callback and most of other complexity, so
>> why such a prejudice? Why can't we just discuss,
>> amend, see what happens?
> The RTLD_NORELOC itself is not a better approach, is just moving the
> same corner cases that we brought to a different symbol... against
> walking in cycles.

Why not to show these problems?
Szabolcs always does. You always assert on
their existence, as if I have no right to even
ask what they are. Maybe they are obvious, maybe
I have to realize them w/o asking, but I can't,
I am a novice.
Please, pretty please, can you detail those problems?
It would be best if you detail them with the
reference to my drafts - then I'll immediately
understand.


> Do not bother, I think the best course of action is just to drop
> the RFE.
Why is that constructive?
Well is not for Szabolcs, I could suspect its
just me. But now if he points to the problems
promptly, then I am sure its possible to do.
  
Adhemerval Zanella Netto April 13, 2023, 8:02 p.m. UTC | #44
On 13/04/23 16:29, stsp wrote:
> Adhemerval,
> 
> 14.04.2023 00:12, Adhemerval Zanella Netto пишет:
>>> I admit you probably couldn't do that initially,
>>> because I poorly documented my API. But
>>> when Carlos asked for a more detailed spec,
>>> I did. Now you can express your objections
>>> in a detailed manner, let me even attach
>>> the prev API draft to help you doing that.
>>> Of course this draft will be simplified a lot,
>>> but for such a "generic" objections please
>>> use its current form.
>> The problem is not a poor documented ABI, the ABI itself has inherent
>> corner cases brought multiple times.
> 
> By whom?
> Where?
> You referred to some 10 years old thread,
> where I didn't even participated. Is it too
> much to ask to at least point me to a particular
> comment in that thread?

Because it is the same remark bought *multiple* times in this thread [1]
that you keep saying it is not important on the discussion:

"It would be possible to require the caller to arrange all of 
these things, but that's basically offloading A LOT of the ELF 
loading process onto the calling program and I don't think that 
makes for a reasonable public interface for glibc to provide."

And every time someone brought it [2], you just reply that this since it
does not fit in your usercase this is not an acceptable solution.  And
then we kept in circles, with multiple developers saying that this
interface-like is not reasonable, and you saying that it is the only way 
to solve your specific usercase.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=11767#c16
[2] https://sourceware.org/pipermail/libc-alpha/2023-April/146886.html

> 
> 
>>    The problem you are trying to
>> solve would be better served with a custom loader.
> 
> I wish to have a custom loader.
> But so far I don't know how to make it
> friendly to glibc. Custom loader needs
> to be able to create a link-map, and glibc
> currently doesn't have a hook to do that.
> I currently only know how to create a
> custom loader that works until glibc
> changes either struct link_map or struct
> rtld_global_ro. Such custom loader can
> be done, but it won't be very reliable.

You can just use glibc, apply your patches, and use it instead.

> 
>>> I have expressed the plans at removing all the
>>> corner cases that Szabolcs pointed to. If you
>>> point more, I'll definitely take an actions.
>>> Off-loading the biggest part to RTLD_NORELOC
>>> will reduce the proposal considerably, avoid
>>> the callback and most of other complexity, so
>>> why such a prejudice? Why can't we just discuss,
>>> amend, see what happens?
>> The RTLD_NORELOC itself is not a better approach, is just moving the
>> same corner cases that we brought to a different symbol... against
>> walking in cycles.
> 
> Why not to show these problems?
> Szabolcs always does. You always assert on
> their existence, as if I have no right to even
> ask what they are. Maybe they are obvious, maybe
> I have to realize them w/o asking, but I can't,
> I am a novice.
> Please, pretty please, can you detail those problems?
> It would be best if you detail them with the
> reference to my drafts - then I'll immediately
> understand.

Because every time someone bring the corner cases you just dismiss saying 
that either the person does not understand you [1] and it is up to review
prove you are wrong; or you dismiss saying that the remarks are not
applicable.

[3] https://sourceware.org/pipermail/libc-alpha/2023-March/146729.html
[4] https://sourceware.org/pipermail/libc-alpha/2023-March/146717.html

> 
> 
>> Do not bother, I think the best course of action is just to drop
>> the RFE.
> Why is that constructive?
> Well is not for Szabolcs, I could suspect its
> just me. But now if he points to the problems
> promptly, then I am sure its possible to do.

Because we are not moving anywhere, if you have not already get it
the replies are to try to show you the problems with the RFE proposal,
not to move toward having dlmem-like interface incorporated.
  
stsp April 13, 2023, 8:21 p.m. UTC | #45
Adhemerval,

14.04.2023 01:02, Adhemerval Zanella Netto пишет:
> Because it is the same remark bought *multiple* times in this thread [1]
> that you keep saying it is not important on the discussion:
>
> "It would be possible to require the caller to arrange all of
> these things, but that's basically offloading A LOT of the ELF
> loading process onto the calling program and I don't think that
> makes for a reasonable public interface for glibc to provide."

That cr^H^Hcomment was indeed brought
to me several times. But there is a second part
of a deal, which is to explain how it is ever
relevant to my patches. Maybe it was not
possible before I wrote a proper spec draft,
but its definitely possible now.

> And every time someone brought it [2], you just reply that

... that its a bu^H^Hirrelevant comment, having
ZERO relevance to my patches?
Well, its true. Now when the patches are properly
specced, I think I have a right to ask why that
comment is even raised at all. Does it have anything
to do with my spec draft? No. Does it have anything
to do with my patches? No. Does it have anything
to do with my usage examples? No. Does it anything
to do with some theoretical usage examples of my
API? Still no!
Does it have to do with anything at all? No.
What should I do? :(

Oh, idea!
Szabolcs never lies, I trust him.
If he says I should address (or even read) that
comment, and that its a comment that is "somehow"
relevant to my API and patches, then I'll bring my
patches down.
But so far I was only seeing that comment as an
attempt to discredit my work by any price, even
by the pure hideous lie. :(


>   this since it
> does not fit in your usercase this is not an acceptable solution.

And I never said anything like this, too.


>    And
> then we kept in circles, with multiple developers saying that this
> interface-like is not reasonable, and you saying that it is the only way
> to solve your specific usercase.

No.
We are walking in circles with intentionally malicious
argument towards my work, because you wanted to
put my patches down by any price.
This is unacceptable.

Szabolcs will not repeat that malicious argument,
or if he will, I'll put my patches down.
  
stsp April 13, 2023, 8:57 p.m. UTC | #46
14.04.2023 01:02, Adhemerval Zanella Netto пишет:
> Because every time someone bring the corner cases you just dismiss saying

Compare that with corner cases brought
by Szabolcs. Have I ignored a single one?

> that either the person does not understand you [1]

So was a hope.
I didn't know it was intentional.


>   and it is up to review
> prove you are wrong;
Why do you intentionally omit the posting
where I actually provide a proof, after getting
off of an initial shock?

https://sourceware.org/pipermail/libc-alpha/2023-March/146742.html


>   or you dismiss saying that the remarks are not
> applicable.

Yes.
The quoted remarks are not applicable.
I proved that, see above URL.
And I explicitly asked a few times, do you
need more proofs, and if so - what kind of?
Its still relevant - I can code up any proof
you want, if the above obvious proof wasn't
sufficient.
That conspiracy theory could only work if not
for Szabolcs who have started the actual
review, and not for the spec drafts that I published,
etc. After all these I was hoping you will no
longer bring that to public, but you did. :(
  
Carlos O'Donell April 13, 2023, 9:17 p.m. UTC | #47
On 3/31/23 07:04, stsp wrote:
> Hi Carlos, Jonathon.
> 
> 29.03.2023 23:13, Carlos O'Donell пишет:
>> The most important thing is the reasons for the change and that should come first.
> 
> Done, please see the attachment.

Your v10 cover letter is great and makes it completely clear what it is
you want to implement.

However, the API itself is just not acceptable to me as an abstraction
that is generic enough to warrant inclusion in glibc. This is a design
decision I'm making as a developer.

In this thread you don't have consensus from Adhemerval, or myself as
core glibc developers. So this patch won't move forward towards inclusion
until you can get our consensus. This *blocks* this patch from ever being
included in glibc per consensus rules:
https://sourceware.org/glibc/wiki/Consensus

You don't have consensus from Zach Weinberg (GNU Autoconf maintainer) who
has also responded on the thread and is a GNU Toolchain contributor.

Likewise you don't have conses from Rich Felker, who as an alternative
libc developer would need to consider implementing something similar.

I don't see any way to salvage this API as it is, and while it might
serve your needs, the core C library is not designed to serve all needs,
but to serve the most generic needs.

The most generic needs can be seen by existing prior art like fdlopen
or dlopen_with_offset(), and they have been implemented.

>> This needs to explain the workload that requires the API.
>>
>> Why that workload is generic.
>>
>> Any core system library, like glibc, is not the place for *all* APIs, but the
>> place for the most generic building blocks for ISO C, POSIX, BSD, GNU, and
>> Linux APIs.
> 
> I hope I've got that suggestion rightly.
> Please see if my current description is
> adequate.

Not quite. A workload is application that would use the API in a specific way.

For example you don't mention which workloads would be improved by using these
APIs? Would an application use the APIs to accelerate or handling plugin loading?
What about loading DSOs from NVRAM? What about replacing all the vDSO handling
with the API?

> And on the other front...
> I studied and documented (in the attachment)
> all the cases where my impl fails to arrange
> an elf segments properly... I have to admit
> such cases were possible. :(
> I documented them and their mitigations in
> the "Limitations" section.
> Let me know if this is now adequate.

These are well documented. You've really done a good job of calling them out.

In summary:

- This API as designed does not have consensus. Posting v10 does not fix the
  consensus problem.

- I suggest dropping the feature for glibc. If there are other things you
  would like to work on in glibc, please reach out again.
  
stsp April 13, 2023, 9:58 p.m. UTC | #48
Hi Carlos,

14.04.2023 02:17, Carlos O'Donell пишет:
> In this thread you don't have consensus from Adhemerval, or myself as
> core glibc developers.

Have you forgot to reply to this e-mail:

https://sourceware.org/pipermail/libc-alpha/2023-March/146742.html

I sent you a proof, I asked multiple times
(including privately) was a proof adequate.
But you haven't replied, and haven't dropped
that malicious "elf parsing" argument.
Can you do that now?
Was my proof sufficient, or should I code
up another one?


>   So this patch won't move forward towards inclusion
> until you can get our consensus. This *blocks* this patch from ever being
> included in glibc per consensus rules:

You perfectly know what _actually_ blocks it.
And its a small and simple fact: you have never
replied to my proof, that I sent to you by your
own request. You haven't put down the malicious
"elf parsing" argument, and haven't said the proof
was insufficient (or sufficient).

> You don't have consensus from Zach Weinberg (GNU Autoconf maintainer) who
> has also responded on the thread and is a GNU Toolchain contributor.

I wasn't CCed with this, and doing a "thread view"
via web interface doesn't allow me to find his e-mail.
Could you please provide an URL?

> Likewise you don't have conses from Rich Felker, who as an alternative
> libc developer would need to consider implementing something similar.

The API is in draft.
Draft can't be in consensus with everyone,
especially until you tell me whether the proof
was sufficient or not.

> I don't see any way to salvage this API as it is,

As it is it shouldn't be included.
Its a draft that should be _discussed_.
Its discussed only by Szabolcs, and very
unfortunately there are also the aforementioned
affairs going on about your "elf parsing" thing.


> Not quite. A workload is application that would use the API in a specific way.
>
> For example you don't mention which workloads would be improved by using these
> APIs? Would an application use the APIs to accelerate or handling plugin loading?
> What about loading DSOs from NVRAM?

OK I'll try to summarize those in the next draft.
Basically the estimation is that this API is very
much comparable with the existing dlopen() in
speed, as it doesn't use slow functions like read()
or memcpy() (well, memcpy() behavior exists
but its not a default one). Whether it increases
or decreases the syscall amount per DSO is
difficult to say (will need to measure), but even
if it does, the overhead would be a couple of
syscalls per DSO.


>   What about replacing all the vDSO handling
> with the API?

My estimation is that you will measure no changes.
Of course the final draft will need the benchmark
numbers, but its too early to add just yet.


> - I suggest dropping the feature for glibc. If there are other things you
>    would like to work on in glibc, please reach out again.
Why not to reply to the aforementioned e-mail
to at least make such a statement fair?
  
stsp April 13, 2023, 10:08 p.m. UTC | #49
OK, I've found the mail from Zack Weinberg,
it wasn't CCed to me. Thanks for telling about
it. Found it in an archive.

14.04.2023 02:17, Carlos O'Donell пишет:
> If there are other things you
>    would like to work on in glibc, please reach out again.
Of course there are:
https://sourceware.org/pipermail/libc-alpha/2023-April/147243.html
Would be good to consider that spec as well.
  
stsp April 13, 2023, 10:50 p.m. UTC | #50
Hi Zack, thanks for a reply.
Have you forgot to CC the e-mail to me,
or was it some spam filtering? Carlos have
just told me about it.
I'll use copy/paste from web interface to reply,
and will monitor the web archive for a few days
in case your mails to me are blocked somewhere.


> Let me try to highlight specific issues from the feedback you've been
> getting.  It appears to me that your *API* proposal -- not any details
> of its implementation, the API itself -- is unacceptable *BECAUSE*:
>
> * It introduces a callback from the dynamic loader into user code, that
>    is to be run with an internal lock held. glibc does already have
>    callbacks like that, but we do not want to introduce any more of
>    them, period.
> * It assumes that the dynamic loader will always, in the future, operate
>    by creating a large initial mapping and then modifying protection
>    flags on it.
Yes, those are the totally valid concerns
raised by Szabolcs. They were very difficult
to address actually, and eventually I came to
this proposal:
https://sourceware.org/pipermail/libc-alpha/2023-April/147243.html

It is needed to address exactly those, but
it may not be obvious yet how exactly they
are related. I think this is not a problem, you'll
see that in a next dlmem() draft. These problems
will be solved. Most of dlmem() can be off-loaded
to either proposed RTLD_NORELOC or something
similar.

> * It assumes that it the dynamic loader is, and will always be, capable
>    of taking a pointer to a block of memory -- no matter how it was
>    originally allocated -- and then chopping its virtual memory map up to
>    satisfy the requirements of the embedded ELF program header table,
>    without breaking anything else.
I think you are talking about a premap callback
here? It will be dropped per comment 1 of yours,
so I think this is mostly a duplicate.

> [On this specific note, I foresee
>    someone trying to write a dynamic module into memory allocated with
>    memalign() or even plain old malloc(); calling dlmem() on this is very
>    likely to apply PROT_NONE to address ranges containing malloc
>    metadata!]
No-no, it never changes the protection
of a source buffer! Nothing to worry about
here. Also there are some recommendations
in an API draft what mappings are supported
as a source and what not - using memalign()
space is very much unrecommended unless
you know what you do (see api draft for details).
But the protections are of course not altered.

> For instance, maybe we could have an fdlopen variant that
> called callback functions instead of mmap() and mprotect().  Unlike
> your dlmem_premap_t, I think it ought to be possible to avoid doing
> those specific operations with any locks held.
But what such a callback would do?
Plain read()?


So.
You are welcome to this lengthy thread. :)
I am attaching an API draft in case you are
curious about a details, but it was not yet
updated to the removal of a callback.
Sorry about that, just take my word on that
for now. :)

There are a few things that you miss about
the proposed design (as every newcomer
to this thread does), so let me show you an
examples of fdlopen() and dlopen_with_offset()
implemented on top of dlmem(). Most of things,
like eg the source buffer usage, will be more
clear after that:

static void *
fdlopen (int fd, int flags)
{
   off_t len;
   void *addr;
   void *handle;

   len = lseek (fd, 0, SEEK_END);
   lseek (fd, 0, SEEK_SET);
   addr = mmap (NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
   if (addr == MAP_FAILED)
     return NULL;
   handle = dlmem (addr, len, flags, NULL);
   munmap (addr, len);
   return handle;
}

If we run that example, then in /proc/self/maps
we will see the correct references to an elf file:

$ LD_LIBRARY_PATH=..:. ./tst-dlmem-fdlopen
unaligned buf gives buffer not aligned: Invalid argument
7fb413101000-7fb413102000 r--p 00000000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413102000-7fb413103000 r-xp 00001000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413103000-7fb413104000 r--p 00002000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413104000-7fb413105000 r--p 00002000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413105000-7fb413106000 rw-p 00003000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so

As you an see, there is no reference to the
full file, just to the elf segments. That's
because the above code just munmap()s the source
buffer after dlmem() returns. Its protection
was never changed.

Now, dlopen_with_offset() is slightly more difficult,
but is suggested for a read too:

/* Load the shared library from a container file.
    file   - file name.
    offset - solib offset within a container file.
             Highly recommended to be page-aligned.
    length - solib file length (not a container file length).
    flags  - dlopen() flags. */
void *dlopen_with_offset4 (const char *file, off_t offset, size_t length,
                            int flags)
{
     void *addr;
     void *handle;
     int fd;
     off_t pad_size = (offset & (getpagesize () - 1));
     off_t aligned_offset = offset - pad_size;
     size_t map_length = length + pad_size;

     fd = open (file, O_RDONLY);
     if (fd == -1)
         return NULL;
     addr = mmap (NULL, map_length, PROT_READ, MAP_PRIVATE, fd, aligned_offset);
     close(fd);
     if (addr == MAP_FAILED)
         return NULL;
     if (pad_size)
       {
         /* We need to fix alignment by hands. :-(
            And for that we need a shared mapping. */
         void *addr2 = mmap (NULL, length, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
         if (addr2 == MAP_FAILED)
           {
             munmap (addr, map_length);
             return NULL;
           }
         memcpy (addr2, addr + pad_size, length);
         munmap (addr, map_length);
         addr = addr2;
         map_length = length;
       }
     handle = dlmem (addr, length, flags, NULL);
     munmap (addr, map_length);
     return handle;
}
  
stsp April 14, 2023, 7:07 a.m. UTC | #51
Hi Adhemerval,

14.04.2023 01:02, Adhemerval Zanella Netto пишет:
> Because it is the same remark bought *multiple* times in this thread [1]
> that you keep saying it is not important on the discussion:
>
> "It would be possible to require the caller to arrange all of
> these things, but that's basically offloading A LOT of the ELF
> loading process onto the calling program and I don't think that
> makes for a reasonable public interface for glibc to provide."
OK, I got off of an another shock from this, and am
absolutely calm again, just having a headache...
So lets get to the facts.

1.  There is now a written spec that explicitly
      says what is required. It says exactly how
      the solib should be put to the memory buffer,
      what types of mapping are supported, what
      types unsupported, how to dlmem() from
      aligned buffer, from unaligned buffer, from
      file-mapped buffer, and so on. So the whole
      "A LOT of the ELF loading process" is now
      fully specified and detailed.

2. The quote you did above, doesn't reveal the
     whole story. This URL does:

https://inbox.sourceware.org/libc-alpha/b6a96687-9a4b-414f-849a-45a305898274@redhat.com/

In this e-mail Carlos does claims that directly
contradict my spec. And based on them, not
based on my spec (!!!) he concludes that my
API is unacceptable. How can the one conclude
some API is unacceptable, writing his own (wrong)
spec of it? That's very strange, isn't it?

If my spec clearly say you only need to mmap()
the solib, and some alternative spec says you
need to also arrange elf segments by hands
before calling dlmem(), then how can this even
be the truth if we discuss _my_ spec? Maybe
Carlos reverse-engineered my implementation
of a spec, and found that it doesn't conform to
my own spec? And wrote his own one that matches
my implementation? I very much doubt so, but
even if this is true, then why not to point me to
such a bug in an implementation, and ask to
make it so that it matches _my_ spec? Why not
to ask me to make the loader itself to arrange
an elf segments, if it is not the case right now?
(but of course it is, that was proved many times)

So overall, I discuss my spec, Carlos discusses
his own. Why is it productive?

If it is widely believed my impl doesn't match
my own spec, then can we for a moment forget
about an impl, and discuss only my spec? It
clearly says that mapping the segment is a
loader's responsibility. I even explicitly wrote
that the mmapped image should not be modified,
or dlmem() won't work. Do we discuss _that_
spec, or some other spec of Carlos? That's what
I'd like to hear an answer to.

> And every time someone brought it [2], you just reply that this since it
> does not fit in your usercase this is not an acceptable solution.
This is yet another misunderstanding.
I say that when people offer me memfd+dlopen
replacement. That doesn't work for me, but it
has zero relevance to the discussed "Carlos's
spec of my API".

Now. I apologize if my yesterday's mails were
a bit rough. I hope we can all be technical and
I wish we all can always be calm. But when my
spec is replaced and another one is being used
to "justify" my work, I am really feeling quite bad.

Lets get technical, lets stick to facts and to my spec,
please. :)
If my spec says no elf parsing or segment
arranging is needed, then lets _believe_ that
my implementation matches it, please.
  
stsp April 14, 2023, 7:36 a.m. UTC | #52
14.04.2023 01:02, Adhemerval Zanella Netto пишет:
> Because every time someone bring the corner cases you just dismiss saying
> that either the person does not understand you [1] and it is up to review
> prove you are wrong; or you dismiss saying that the remarks are not
> applicable.
Yes, because I wrote into my spec that this
particular remark in not applicable. In particular
I detailed the process of loading a solib, and
stated that the initially mmapped buffer should
not be modified.
Whatever I write to a spec, should be regarded
as "the truth by definition" because we discuss
_my_ spec, not Carlos's spec.
I can't be even asked to "prove what I wrote into
a spec" - its an incorrect requirement. But I
did an attempt to prove that my _implementation_
is conforming to my spec. No one "noticed"...
But it doesn't even matter, because we are
discussing a spec, not an implementation!
So nothing from my spec should be a subject
of some "proving". And per my spec, no "elf
parsing" is needed, and no manual arranging
of PT_LOAD segments is needed - dlmem()
does it itself. This can't be put under any disbelief,
because its a spec!
  
stsp April 14, 2023, 11:30 a.m. UTC | #53
Adhemerval,

14.04.2023 01:02, Adhemerval Zanella Netto пишет:
> Because it is the same remark bought *multiple* times in this thread [1]
> that you keep saying it is not important on the discussion:
>
> "It would be possible to require the caller to arrange all of
> these things, but that's basically offloading A LOT of the ELF
> loading process onto the calling program and I don't think that
> makes for a reasonable public interface for glibc to provide."
Now let me quote the relevant part of _my_
spec:

- If you use a private file-backed mapping, then it shouldn't be
   modified by hands before passing to dlmem(). I.e. you can't apply
   mprotect() to it to change protection bits, and you can't apply
   memmove() to it to move the solib to the beginning of the buffer,
   and so on. dlmem() can only work with "virgin" private file-backed
   mappings.


That's what _my_ spec says.
Why can't we just settle on a fact that anything
that contradicts my spec, be it taken from some
10-years-old thread or from elsewhere, cannot
be used as an argument when judging my API?
Why can't we settle on a fact that even the
attempts to do so are unacceptable, especially
the repetitive attempts?
We discuss my spec, so replacing its parts during
the discussion, is counter-productive.

> Because every time someone bring the corner cases you just dismiss saying
> that either the person does not understand you [1] and it is up to review
> prove you are wrong;
Because the reviewer must not operate on
some other spec when judging my spec. If
the reviewer thinks his spec describes my
proposed implementation better than my
own spec, then he should better actually try
to prove that. But at the end I presented my
own proof that my impl is conforming:
https://sourceware.org/pipermail/libc-alpha/2023-March/146742.html

And now the question is simple: what else
should be done on my side to have that
case finally resolved? Code up some other
proof? Write more clearly in a spec that no
elf parsing is needed?
Its a simple question, just what should I do?
You tell - I do.
  
Zack Weinberg April 14, 2023, 4:15 p.m. UTC | #54
On 2023-04-13 5:17 PM, Carlos O'Donell via Libc-alpha wrote:
> Zach Weinberg (GNU Autoconf maintainer)
I feel that referring to me as the GNU Autoconf maintainer is likely to 
mislead people.

At present, GNU Autoconf does not _have_ any active maintainers, and 
this is not likely to change anytime soon.  I _wanted_ to put out a new 
Autoconf release (to better support clang 16.0.0, which has turned off 
support for implicit function declarations in its default mode) months 
ago and it hasn't been possible because neither I nor anyone else has 
the time to make it happen.

zw
  
Carlos O'Donell April 14, 2023, 8:24 p.m. UTC | #55
On 4/14/23 12:15, Zack Weinberg wrote:
> On 2023-04-13 5:17 PM, Carlos O'Donell via Libc-alpha wrote:
>> Zach Weinberg (GNU Autoconf maintainer)
> I feel that referring to me as the GNU Autoconf maintainer is likely
> to mislead people.

How would you like to be described?

I struggled for a term to describe your role in the ecosystem of packages around
the toolchain space.

I often quote your "A Maintenance Programmer’s View of GCC" (GCC Summit 2003)
around incomplete transitions :-)

> At present, GNU Autoconf does not _have_ any active maintainers, and
> this is not likely to change anytime soon.  I _wanted_ to put out a
> new Autoconf release (to better support clang 16.0.0, which has
> turned off support for implicit function declarations in its default
> mode) months ago and it hasn't been possible because neither I nor
> anyone else has the time to make it happen.

Thanks for that update.

I think that a new Autoconf release is something that the community would want,
particularly as Fedora and Gentoo move towards Modern C:

https://fedoraproject.org/wiki/Changes/PortingToModernC
  
Zack Weinberg April 14, 2023, 8:40 p.m. UTC | #56
On 2023-04-14 4:24 PM, Carlos O'Donell via Libc-alpha wrote:
> On 4/14/23 12:15, Zack Weinberg wrote:
>> On 2023-04-13 5:17 PM, Carlos O'Donell via Libc-alpha wrote:
>>> Zach Weinberg (GNU Autoconf maintainer)
>> I feel that referring to me as the GNU Autoconf maintainer is 
>> likely to mislead people.
> 
> How would you like to be described?

As an individual contributor to many GNU packages, who doesn't have time
to pitch in nearly as much as he'd like.

With regard to autoconf, I may have been the most recent person who
managed to put on a show, and that does mean I feel responsible for
keeping the monkeys fed for the time being, but I have neither time
nor plans to put the circus back into regular operation.

> I think that a new Autoconf release is something that the community 
> would want, particularly as Fedora and Gentoo move towards Modern C:

Yah.  But that conversation is already happening over on autoconf's
mailing lists and is fully off-topic for libc-alpha.

zw
  
stsp May 8, 2023, 3:05 p.m. UTC | #57
Hello,

14.04.2023 02:17, Carlos O'Donell пишет:
> In summary:
>
> - This API as designed does not have consensus. Posting v10 does not fix the
>    consensus problem.
>
> - I suggest dropping the feature for glibc. If there are other things you
>    would like to work on in glibc, please reach out again.
Your wish is my command.
dlmem patches are dropped, and the
alternative proposal is here:
https://sourceware.org/pipermail/libc-alpha/2023-May/147899.html
Full implementation and test coverage
is here:
https://sourceware.org/bugzilla/show_bug.cgi?id=30007
  
stsp May 8, 2023, 3:10 p.m. UTC | #58
Hi Szabolcs, I've finally completed an impl
of RTLD_NORELOCATE:
https://sourceware.org/pipermail/libc-alpha/2023-May/147899.html
It appears to be a much more flexible
approach than dlmem(), so I just managed
to build the entire functionality I need, over
that new proposal, rather than using it only
for a part of functionality, as I thought before.
Implementation with test coverage and full
descriptions, is here:

https://sourceware.org/bugzilla/show_bug.cgi?id=30007

Is this more viable, what do you think?
  
stsp May 19, 2023, 7:26 a.m. UTC | #59
Hello,

14.04.2023 02:17, Carlos O'Donell пишет:
>   If there are other things you
>    would like to work on in glibc, please reach out again.
So I posted the entire patch-set to an ML now:

https://sourceware.org/pipermail/libc-alpha/2023-May/148241.html

Patch bot seems to succeed with these:
https://patchwork.sourceware.org/project/glibc/patch/20230518082854.3903342-15-stsp2@yandex.ru/

Is there anything else to do on my side to
reach out again?