[0/4] config: Allow a host to opt out of PCH.

Message ID 20211104200218.24159-1-iain@sandoe.co.uk

Message

Iain Sandoe Nov. 4, 2021, 8:02 p.m. UTC
GCC currently has an implementation of precompiled headers (PCH) that relies
on being able to launch the compiler executable at the same address each
time.  Some system security models do not permit this constraint.

The facility is an optimisation: it saves the output of parsing a covering
header file (that may include many others) so that the parsing need not be
repeated when the same set of headers is needed in many places in a project.

The patch series disables the operation of the PCH-related command-line
options, but does not cause an error to be emitted.  The intent is that build
recipes that expect PCH to work will continue to operate, but the compiler
no longer acts on them and therefore is no longer bound to the requirement
to launch at a fixed address.

 * When invoked to "generate PCH" the compiler will carry out the parsing
   as before - producing any diagnostics if relevant and then saving a
   stub file (to satisfy build recipe targets).  The stub file is marked as
   invalid PCH.

 * When an include directive is encountered, the compiler no longer checks
   to see if a PCH header is available.

 * The top-level configure option (--disable-host-pch-support) is also
   propagated to libstdc++ where it causes the automatic invocation of the
   existing --disable-libstdxx-pch.
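
A minimal sketch of the behaviour in the points above, not the code from the
series.  HOST_PCH_SUPPORT, stub_marker and the three function names are
hypothetical; the real macro and marker are defined in the patches themselves.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for whatever macro configure defines; the real
   name used by the series is not shown in this message.  */
#ifndef HOST_PCH_SUPPORT
#define HOST_PCH_SUPPORT 0
#endif

/* Hypothetical marker written at the start of a stub file.  */
static const char stub_marker[] = "invalid-pch-stub\n";

/* "Generate PCH": parsing and diagnostics are assumed to have happened
   already.  Without host support only the stub marker is written, so the
   build recipe still gets its target file.  Returns 0 on success.  */
static int
write_pch (FILE *f, const char *state, size_t len)
{
  if (!HOST_PCH_SUPPORT)
    return fwrite (stub_marker, 1, sizeof stub_marker - 1, f)
           == sizeof stub_marker - 1 ? 0 : -1;
  return fwrite (state, 1, len, f) == len ? 0 : -1;
}

/* On an include directive: without host support, never look for a PCH.  */
static int
should_look_for_pch (void)
{
  return HOST_PCH_SUPPORT;
}

/* Returns 1 when the buffer begins with the stub marker.  */
static int
is_stub_pch (const char *buf, size_t len)
{
  size_t n = sizeof stub_marker - 1;
  return len >= n && memcmp (buf, stub_marker, n) == 0;
}
```

The point of the stub is only that the target file exists and can never be
mistaken for a usable PCH.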

Tested on x86_64-darwin, aarch64-darwin, x86_64-linux and powerpc64le-linux.
OK for master?
thanks
Iain

Iain Sandoe (4):
  config: Add top-level flag to disable host PCH.
  libstdc++: Adjust build of PCH files accounting configured host
    support.
  libcpp: Honour a configuration without host support for PCH.
  c-family, gcc: Allow configuring without support for PCH.

 Makefile.def              |  9 ++--
 Makefile.in               | 87 +++++++++++++++++++++++++--------------
 configure                 | 42 +++++++++++++++++++
 configure.ac              | 35 ++++++++++++++++
 gcc/c-family/c-pch.c      | 23 ++++++++++-
 gcc/config.in             |  6 +++
 gcc/config/host-darwin.c  | 18 ++++++++
 gcc/configure             | 29 ++++++++++++-
 gcc/configure.ac          | 17 ++++++++
 gcc/doc/install.texi      |  6 +++
 libcpp/config.in          |  3 ++
 libcpp/configure          | 24 +++++++++++
 libcpp/configure.ac       | 16 +++++++
 libcpp/files.c            | 14 +++++++
 libcpp/pch.c              | 12 ++++++
 libstdc++-v3/acinclude.m4 | 49 +++++++++++++---------
 libstdc++-v3/configure    | 71 +++++++++++++++++++++-----------
 libstdc++-v3/configure.ac | 11 ++++-
 18 files changed, 391 insertions(+), 81 deletions(-)
  

Comments

Richard Biener Nov. 5, 2021, 9:42 a.m. UTC | #1
On Thu, Nov 4, 2021 at 9:03 PM Iain Sandoe via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> GCC (currently) has an implementation of pre-compiled-headers, that relies
> on being able to launch the compiler executable at the same address each
> time.  This constraint is not permitted by some system security models.
>
> The facility is an optimisation; saving the output of parsing a covering
> header file (that may include many others) so that the parsing need not be
> repeated when the same set of headers is needed in many places in a project.
>
> The patch series disables the operation of the PCH-related command lines,
> but does not cause an error to be emitted.  The intent is that build
> recipes that expect PCH to work will continue to operate, but the compiler
> no longer acts on them and therefore is no longer bound to the requirement
> to launch at a fixed address.
>
>  * When invoked to "generate PCH" the compiler will carry out the parsing
>    as before - producing any diagnostics if relevant and then saving a
>    stub file (to satisfy build recipe targets).  The stub file is marked as
>    invalid PCH.
>
>  * When an include directive is encountered, the compiler no longer checks
>    to see if a PCH header is available.
>
>  * The top-level configure option (--disable-host-pch-support) is also
>    propagated to libstdc++ where it causes the automatic invocation of the
>    existing --disable-libstdxx-pch.
>
> tested on x86_64-darwin, aarch64-darwin, and on x86_64, powerpc64le-linux,
> OK for master?

I had the impression we have support for PCH file relocation to deal with ASLR
at least on some platforms.  But it's IMHO nice to have a way to disable PCH
and that paves the way to have it disabled by default for a release before we
eventually nuke support completely (and then provide a backward-compatible
stub implementation).

So - OK if there are no complaints from the reviewers of the respective areas
the series touches.

Thanks,
Richard.

  
Jakub Jelinek Nov. 5, 2021, 9:54 a.m. UTC | #2
On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
> I had the impression we have support for PCH file relocation to deal with ASLR
> at least on some platforms.

Unfortunately we do not.  E.g. if you build cc1/cc1plus as PIE on
x86_64-linux, PCH will stop working unless one always invokes it with
ASLR disabled through personality.

I think this is related to function pointers and pointers to .rodata/.data
etc. variables in GC memory; we currently do not relocate those.

What we perhaps could do is (at least assuming all the ELF PT_LOAD segments
are adjacent with a single load base for them - I think at least ia64
non-PIE binaries were violating this by having .text and .data PT_LOAD
segments many terabytes apart with a hole in between not protected in any
way, but dunno if that holds for PIEs too), perhaps try, in a host-specific
way, to remember the address range in which the function pointers and
.rodata/.data can exist, remember the extent start and end from PCH generation
and on PCH load query those addresses for the current compiler and relocate
everything in that extent by the load bias from the last run.
But, the assumption for this is that those function and data/rodata pointers
in GC memory are actually marked at least as pointers...
Do we e.g. have objects with virtual classes in GC memory and if so, do we
catch their virtual table pointers?
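
A rough sketch of the extent-based relocation described above, under the
stated assumption that every slot visited really is a pointer.  The type and
function names (exe_extent, relocate_exe_pointers) are made up for
illustration and do not exist in GCC.

```c
#include <stddef.h>
#include <stdint.h>

/* Address range occupied by the compiler executable (hypothetical type).  */
struct exe_extent
{
  uintptr_t start;   /* first byte of the lowest mapping */
  uintptr_t end;     /* one past the last byte of the highest mapping */
};

/* Adjust every pointer-sized slot whose value lies inside the extent
   recorded at PCH generation by the difference between the current load
   address and the recorded one.  Returns the number of slots changed.
   An integer that merely looks like an address in the range would be
   corrupted, which is why the slots must be known to be pointers.  */
static size_t
relocate_exe_pointers (uintptr_t *slots, size_t n,
                       struct exe_extent saved, uintptr_t current_start)
{
  /* Unsigned arithmetic wraps, so a lower load address also works.  */
  uintptr_t bias = current_start - saved.start;
  size_t changed = 0;

  for (size_t i = 0; i < n; i++)
    if (slots[i] >= saved.start && slots[i] < saved.end)
      {
        slots[i] += bias;
        changed++;
      }
  return changed;
}
```

This only works with a single load bias, i.e. when all PT_LOAD segments move
together.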

	Jakub
  
Richard Biener Nov. 5, 2021, 10:31 a.m. UTC | #3
On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
> > I had the impression we have support for PCH file relocation to deal with ASLR
> > at least on some platforms.
>
> Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
> x86_64-linux, PCH will stop working unless one always invokes it with
> disabled ASLR through personality.
>
> I think this is related to function pointers and pointers to .rodata/.data
> etc. variables in GC memory, we currently do not relocate that.
>
> What we perhaps could do is (at least assuming all the ELF PT_LOAD segments
> are adjacent with a single load base for them - I think at least ia64
> non-PIE binaries were violating this by having .text and .data PT_LOAD
> segments many terabytes apart with a hole in between not protected in any
> way, but dunno if that holds for PIEs too), perhaps try, in a host-specific
> way, to remember the address range in which the function pointers and
> .rodata/.data can exist, remember the extent start and end from PCH generation
> and on PCH load query those addresses for the current compiler and relocate
> everything in that extent by the load bias from the last run.
> But, the assumption for this is that those function and data/rodata pointers
> in GC memory are actually marked at least as pointers...

If any such pointers exist they must be marked GTY((skip)) since they do not
point to GC memory...  So we'd need to invent special-handling for those.

> Do we e.g. have objects with virtual classes in GC memory and if so, do we
> catch their virtual table pointers?

Who knows, but then I don't remember adding stuff that should end in a PCH.

Honestly I don't think it's worth spending too much time on making this work.
If anything, disallow pointers to outside GC memory in PCH (maybe code abort ()
or mark_invalid_pch calls into the PCH walkers when they reach a GTY((skip))).

Richard.

>         Jakub
>
  
Jakub Jelinek Nov. 5, 2021, 3:25 p.m. UTC | #4
On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
> On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
> > > I had the impression we have support for PCH file relocation to deal with ASLR
> > > at least on some platforms.
> >
> > Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
> > x86_64-linux, PCH will stop working unless one always invokes it with
> > disabled ASLR through personality.
> >
> > I think this is related to function pointers and pointers to .rodata/.data
> > etc. variables in GC memory, we currently do not relocate that.
> >
> > What we perhaps could do is (at least assuming all the ELF PT_LOAD segments
> > are adjacent with a single load base for them - I think at least ia64
> > non-PIE binaries were violating this by having .text and .data PT_LOAD
> > segments many terabytes apart with a hole in between not protected in any
> > way, but dunno if that holds for PIEs too), perhaps try, in a host-specific
> > way, to remember the address range in which the function pointers and
> > .rodata/.data can exist, remember the extent start and end from PCH generation
> > and on PCH load query those addresses for the current compiler and relocate
> > everything in that extent by the load bias from the last run.
> > But, the assumption for this is that those function and data/rodata pointers
> > in GC memory are actually marked at least as pointers...
> 
> If any such pointers exist they must be marked GTY((skip)) since they do not
> point to GC memory...  So we'd need to invent special-handling for those.
> 
> > Do we e.g. have objects with virtual classes in GC memory and if so, do we
> > catch their virtual table pointers?
> 
> Who knows, but then I don't remember adding stuff that should end in a PCH.

So, I've investigated a little bit.
Apparently all the relocation we currently do for PCH is done at PCH write
time, we choose some address range in the address space we think will be likely
mmappable each time successfully, relocate all pointers pointing to GC
memory to point in there and then write that to file, together with the
scalar GTY global vars values and GTY pointers in global vars.
On PCH load, we just try to mmap memory in the right range, fail PCH load if
unsuccessful, and read the GC memory into that range and update scalar and
pointer GTY global vars from what we've recorded.
The patch that made PCH load fail for PIEs etc. was
https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
If we wanted to relocate pointers to functions and .data/.rodata etc.,
ideally we'd create a relocation list of addresses that should be
incremented by the bias and quickly relocate those.

I wrote the following ugly hack:

--- ggc-common.c.jj	2021-08-19 11:42:27.365422400 +0200
+++ ggc-common.c	2021-11-05 15:37:51.447222544 +0100
@@ -404,6 +404,9 @@ struct mmap_info
 
 /* Write out the state of the compiler to F.  */
 
+char *exestart = (char *) 2;
+char *exeend = (char *) 2;
+
 void
 gt_pch_save (FILE *f)
 {
@@ -458,6 +461,14 @@ gt_pch_save (FILE *f)
     for (rti = *rt; rti->base != NULL; rti++)
       if (fwrite (rti->base, rti->stride, 1, f) != 1)
 	fatal_error (input_location, "cannot write PCH file: %m");
+      else if ((((uintptr_t) rti->base) & (sizeof (void *) - 1)) == 0)
+        {
+          char *const *p = (char *const *) rti->base;
+          char *const *q = (char *const *) ((uintptr_t) rti->base + (rti->stride & ~(sizeof (void *) - 1)));
+          for (; p < q; p++)
+	    if (*p >= exestart && *p < exeend)
+	      fprintf (stderr, "scalar at %p points to executable %p\n", (void *) p, (void *) *p);
+        }
 
   /* Write out all the global pointers, after translation.  */
   write_pch_globals (gt_ggc_rtab, &state);
@@ -546,6 +557,15 @@ gt_pch_save (FILE *f)
       state.ptrs[i]->note_ptr_fn (state.ptrs[i]->obj,
 				  state.ptrs[i]->note_ptr_cookie,
 				  relocate_ptrs, &state);
+      if ((((uintptr_t) state.ptrs[i]->obj) & (sizeof (void *) - 1)) == 0)
+        {
+          char *const *p = (char *const *) (state.ptrs[i]->obj);
+          char *const *q = (char *const *) ((uintptr_t) (state.ptrs[i]->obj) + (state.ptrs[i]->size & ~(sizeof (void *) - 1)));
+          for (; p < q; p++)
+	    if (*p >= exestart && *p < exeend)
+	      fprintf (stderr, "object %p at %p points to executable %p\n", (void *) (state.ptrs[i]->obj), (void *) p, (void *) *p);
+        }
+
       ggc_pch_write_object (state.d, state.f, state.ptrs[i]->obj,
 			    state.ptrs[i]->new_addr, state.ptrs[i]->size,
 			    state.ptrs[i]->note_ptr_fn == gt_pch_p_S);

and, under a debugger, set exestart and exeend from /proc/*/maps of the cc1plus
process being debugged (the extent of cc1plus mappings).
This resulted in something like:
scalar at 0x3d869a8 points to executable 0x2dd85e0
scalar at 0x3d869b0 points to executable 0x2dd85e4
scalar at 0x3d869c8 points to executable 0x2dd85e7
...
object 0x7fffea007e70 at 0x7fffea007e70 points to executable 0x11e48c2
object 0x7fffe953dcc0 at 0x7fffe953dcc0 points to executable 0x201e222
object 0x7fffe401d260 at 0x7fffe401d260 points to executable 0x4b0a27
object 0x7fffea02fce0 at 0x7fffea02fce0 points to executable 0x18bb2b0
object 0x7fffe7034ca0 at 0x7fffe7034ca0 points to executable 0x2f81537
object 0x7fffe700f8a0 at 0x7fffe700f8a0 points to executable 0x2c36a32
on stderr.  Unfortunately, I didn't try to rebuild the compiler as PIE, so
the range was 0x400000 .. 0x3d9b000 and I'm not really sure whether
everything it dumped was actually an address or just some nice number like
0x1000000 etc.  Much better would be to have the compiler as PIE, run it twice and
only look at values that actually changed, or link the compiler at some very
unlikely virtual address offset so that addresses into it would be easy to
spot.
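
Instead of setting exestart and exeend by hand under the debugger, the extent
could be collected from the maps file programmatically.  A sketch, assuming
the usual /proc/<pid>/maps line layout and pathnames without spaces;
note_exe_mapping is a hypothetical helper, not part of the hack above.

```c
#include <stdio.h>
#include <string.h>

/* Widen [*START, *END) to cover the mapping described by LINE, one line in
   the format of /proc/<pid>/maps, if its pathname ends with EXE_NAME.
   An extent with *START == *END is treated as empty.
   Returns 1 if the line matched, 0 otherwise.  */
static int
note_exe_mapping (const char *line, const char *exe_name,
                  unsigned long *start, unsigned long *end)
{
  unsigned long lo, hi;
  char path[512];

  /* Fields: address range, perms, offset, dev, inode, pathname.
     Lines without a pathname yield only two conversions.  */
  if (sscanf (line, "%lx-%lx %*s %*s %*s %*s %511s", &lo, &hi, path) != 3)
    return 0;

  size_t plen = strlen (path), nlen = strlen (exe_name);
  if (plen < nlen || strcmp (path + plen - nlen, exe_name) != 0)
    return 0;

  if (*start == *end)
    {
      /* First matching line.  */
      *start = lo;
      *end = hi;
    }
  else
    {
      if (lo < *start)
        *start = lo;
      if (hi > *end)
        *end = hi;
    }
  return 1;
}
```

Feeding it every line of the file for the cc1plus process gives the lowest
start and highest end of the executable's own mappings.
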
All the "scalar at " messages are for offsets in the ovl_op_info
array.
struct GTY(()) ovl_op_info_t {
  /* The IDENTIFIER_NODE for the operator.  */
  tree identifier;
  /* The name of the operator.  */
  const char *name;
  /* The mangled name of the operator.  */
  const char *mangled_name;
  /* The (regular) tree code.  */
  enum tree_code tree_code : 16;
  /* The (compressed) operator code.  */
  enum ovl_op_code ovl_op_code : 8;
  /* The ovl_op_flags of the operator */
  unsigned flags : 8;
};
For that particular case gengtype emits:
  {
    &ovl_op_info[0][0].identifier,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
  {
    &ovl_op_info[0][0].name,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    (gt_pointer_walker) &gt_ggc_m_S,
    (gt_pointer_walker) &gt_pch_n_S
  },
  {
    &ovl_op_info[0][0].mangled_name,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    (gt_pointer_walker) &gt_ggc_m_S,
    (gt_pointer_walker) &gt_pch_n_S
  },
so I believe we always treat the identifier as a GC memory object pointer,
and name and mangled_name are const char * pointers which I vaguely remember
we allow to be either NULL, or 1, or GC memory pointers, or string literals
(but I can't find how it deals with that last category in the source).
From the source:
ovl_op_info_t ovl_op_info[2][OVL_OP_MAX] = 
  {
    {
      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
      {NULL_TREE, NULL, NULL, NOP_EXPR, OVL_OP_NOP_EXPR, 0},
#define DEF_OPERATOR(NAME, CODE, MANGLING, FLAGS) \
      {NULL_TREE, NAME, MANGLING, CODE, OVL_OP_##CODE, FLAGS},
#define OPERATOR_TRANSITION }, {                        \
      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
#include "operators.def"
    }
  };
where operators.def has e.g.:
DEF_OPERATOR ("new", NEW_EXPR, "nw", OVL_OP_FLAG_ALLOC)
in this particular array the strings are always string literals.
I guess to get ovl_op_info out of the picture we could mark
name and mangled_name as GTY((skip)).
But that is just 178 records; the remaining 52520 are in GC memory
objects.  Figuring out what exactly they are in will be harder...
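
For reference, the way such a table ends up full of string-literal pointers
can be shown with a cut-down version of the same pattern.  This is a toy:
op_info, op_code and the inline OPERATORS list are stand-ins for
ovl_op_info_t and operators.def, and GTY is defined away here so the file
compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/* The annotations are only read by gengtype; for the compiler proper they
   expand to nothing.  This define stands in for the one in GCC's headers.  */
#define GTY(x)

enum op_code { OP_ERROR, OP_NEW, OP_DELETE, OP_PLUS, OP_MAX };

/* Cut-down stand-in for ovl_op_info_t: the tree identifier and the
   bitfields are omitted.  GTY((skip)) asks gengtype not to walk a field.  */
struct GTY(()) op_info
{
  const char * GTY((skip)) name;
  const char * GTY((skip)) mangled_name;
  enum op_code code;
};

/* Inline replacement for #include "operators.def".  */
#define OPERATORS \
  DEF_OPERATOR ("new", OP_NEW, "nw") \
  DEF_OPERATOR ("delete", OP_DELETE, "dl") \
  DEF_OPERATOR ("+", OP_PLUS, "pl")

/* Every name and mangled_name below points into the executable's
   read-only data, never into GC memory.  */
static const struct op_info op_table[] =
{
  { NULL, NULL, OP_ERROR },
#define DEF_OPERATOR(NAME, CODE, MANGLING) { NAME, MANGLING, CODE },
  OPERATORS
#undef DEF_OPERATOR
};
```

With the two string fields skipped, the table would no longer contribute
pointers into the executable to the PCH image.
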
From the addresses it printed in the last column, the following point
to the start of some cc1plus symbol:
  3310: 0000000000c121d2   831 FUNC    LOCAL  DEFAULT   14 _ZL9min_vis_rPP9tree_nodePiPv
134773: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
  6151: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
188594: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
 37908: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
 50655: 0000000001707c85    37 FUNC    LOCAL  DEFAULT   14 _ZL20realloc_for_line_mapPvm
131570: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
  1653: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
129108: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
 51650: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
 77141: 0000000001b6cb5a   159 FUNC    LOCAL  DEFAULT   14 _ZL10emit_localP9tree_nodePKcmm
 77142: 0000000001b6cbf9    75 FUNC    LOCAL  DEFAULT   14 _ZL8emit_bssP9tree_nodePKcmm
 77143: 0000000001b6cc44    75 FUNC    LOCAL  DEFAULT   14 _ZL11emit_commonP9tree_nodePKcmm
 77144: 0000000001b6cc8f   231 FUNC    LOCAL  DEFAULT   14 _ZL15emit_tls_commonP9tree_nodePKcmm
181390: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
 25347: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
160243: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
163230: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
 26343: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
 40584: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
 12547: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
165150: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
181147: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
 26558: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
  8400: 0000000002e13f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_1
  8448: 0000000002e14260    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
 10166: 0000000002e444a0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL11ALL_GPR_ARGE
 11568: 0000000002e51420    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512FP16
 11735: 0000000002e52f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_VPCLMULQDQ
 12575: 0000000002e5f560    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_ROCKETLAKE
165019: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
  9991: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
 12749: 0000000002e60f60   160 OBJECT  LOCAL  DEFAULT   16 _ZL22extra_order_size_table
 14715: 0000000002e7e340    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
 15895: 0000000002e84480    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_UINTR
 17084: 0000000002e8c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SAPPHIRERAPIDS
 18397: 0000000002e946a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_2
 18986: 0000000002e97580    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
 18990: 0000000002e975c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL17PTA_GOLDMONT_PLUS
 22195: 0000000002eb1640    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_FMA
 30065: 0000000002eed6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512BF16
 31474: 0000000002ef3560     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
 34906: 0000000002f02580    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512BW
 37696: 0000000002f0e420     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
 37701: 0000000002f0e484     4 OBJECT  LOCAL  DEFAULT   16 _ZL40LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
 38868: 0000000002f13420    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
 39129: 0000000002f143c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_XSAVES
 40610: 0000000002f1e7c0     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
 42157: 0000000002f293c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
 42201: 0000000002f29680    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
 42207: 0000000002f296e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_ICELAKE_SERVER
 49618: 0000000002f556e0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL8USED_ARGE
 50904: 0000000002f5d4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_AVX
 51188: 0000000002f5e6e0    48 OBJECT  LOCAL  DEFAULT   16 _ZN12_GLOBAL__N_1L17pass_data_tm_initE
 56440: 0000000002f7d440    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
 57404: 0000000002f81640     4 OBJECT  LOCAL  DEFAULT   16 _ZL14MAX_LOCATION_T
 57424: 0000000002f816a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_64BIT
 60100: 0000000002f903a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_TILE
 67672: 0000000002fae460    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_COOPERLAKE
 68780: 0000000002fb37c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512CD
 70316: 0000000002fbb4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_LWP
 70637: 0000000002fbc7a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
 70837: 0000000002fbd4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
 73878: 0000000002fcb960    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
 79867: 00000000030435c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
 81991: 0000000003053520    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_F16C
 82244: 0000000003054500    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
 86070: 00000000033ec560    99 OBJECT  LOCAL  DEFAULT   16 _ZL26znver1_agu_min_issue_delay
 86071: 00000000033ec5e0  1334 OBJECT  LOCAL  DEFAULT   16 _ZL15geode_translate
 86228: 00000000034419c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_NO_80387
 94224: 0000000003849420    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
 94230: 0000000003849480    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_XOP
 94647: 000000000384aa40    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SGX
 95488: 000000000384e4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNM
 95820: 000000000384f6a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
 95822: 000000000384f6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
 95824: 000000000384f6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_WAITPKG
 96072: 0000000003850640    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_LZCNT
 96074: 0000000003850660    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_MOVBE
 96080: 00000000038506c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SSE
 98344: 000000000385a4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_ENQCMD
 99309: 000000000385da40    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_POPCNT
103332: 000000000386f2c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CLFLUSHOPT
103344: 000000000386f380    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_PKU
103352: 000000000386f400    16 OBJECT  LOCAL  DEFAULT   16 _ZL15PTA_AVX512VBMI2
103709: 0000000003870a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
104337: 0000000003873660    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
106315: 000000000387d260    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_NO_TUNE
109183: 000000000388c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
111159: 0000000003894a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CANNONLAKE
112043: 00000000038994c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
112049: 0000000003899520    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
113040: 000000000389d6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_BF16
 21876: 0000000003d8d5c0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
 31109: 0000000003d8e100    40 OBJECT  LOCAL  DEFAULT   28 _ZL30unspecified_modref_access_node
 78193: 0000000003d932e0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
 78366: 0000000003d93320    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names

	Jakub
  
Iain Sandoe Nov. 5, 2021, 4:37 p.m. UTC | #5
> On 5 Nov 2021, at 15:25, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
>> On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek <jakub@redhat.com> wrote:
>>> 
>>> On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
>>>> I had the impression we have support for PCH file relocation to deal with ASLR
>>>> at least on some platforms.
>>> 
>>> Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
>>> x86_64-linux, PCH will stop working unless one always invokes it with
>>> disabled ASLR through personality.
>>> 
>>> I think this is related to function pointers and pointers to .rodata/.data
>>> etc. variables in GC memory, we currently do not relocate that.
>>> 
>>> What we perhaps could do is (at least assuming all the ELF PT_LOAD segments
>>> are adjacent with a single load base for them - I think at least ia64
>>> non-PIE binaries were violating this by having .text and .data PT_LOAD
>>> segments many terabytes apart with a hole in between not protected in any
>>> way, but dunno if that holds for PIEs too), perhaps try, in a host-specific
>>> way, to remember the address range in which the function pointers and
>>> .rodata/.data can exist, remember the extent start and end from PCH generation
>>> and on PCH load query those addresses for the current compiler and relocate
>>> everything in that extent by the load bias from the last run.
>>> But, the assumption for this is that those function and data/rodata pointers
>>> in GC memory are actually marked at least as pointers...
>> 
>> If any such pointers exist they must be marked GTY((skip)) since they do not
>> point to GC memory...  So we'd need to invent special-handling for those.
>> 
>>> Do we e.g. have objects with virtual classes in GC memory and if so, do we
>>> catch their virtual table pointers?
>> 
>> Who knows, but then I don't remember adding stuff that should end in a PCH.
> 
> So, I've investigated a little bit.
> Apparently all the relocation we currently do for PCH is done at PCH write
> time, we choose some address range in the address space we think will be likely
> mmappable each time successfully, relocate all pointers pointing to GC
> memory to point in there and then write that to file, together with the
> scalar GTY global vars values and GTY pointers in global vars.
> On PCH load, we just try to mmap memory in the right range, fail PCH load if
> unsuccessful, and read the GC memory into that range and update scalar and
> pointer GTY global vars from what we've recorded.
> The patch that made PCH load fail for PIEs etc. was
> https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
> If we wanted to relocate pointers to functions and .data/.rodata etc.,
> ideally we'd create a relocation list of addresses that should be
> incremented by the bias and quickly relocate those.

It is hard to judge the relative effort in the two immediately visible solutions:

1. relocatable PCH
2. taking the tree streamer from the modules implementation, moving its home
    to c-family and adding hooks so that each FE can stream its own special trees.

ISTM that part of the reason people dislike PCH is that the implementation is
mixed up with the GC solution - the rendering is non-transparent etc.

So, in some ways, (2) above would be a better investment - the process of PCH is:
generate:
“get to the end of parsing a TU” .. stream the AST
consume:
.. see a header .. stream the PCH AST in if there is one available for the header.

There is no reason for this to be mixed into the GC solution - the read in (currently)
happens to an empty TU and there should be nothing in the AST that carries any
reference to the compiler’s executable.
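
As a toy illustration of why a streamed form needs no fixed load address
(this is not the modules streamer, and all names here are hypothetical):
pointers between nodes are written as indices and rebuilt on read.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy AST node: a value and up to two children.  */
struct node
{
  int32_t value;
  struct node *kid[2];
};

/* On-disk form: children are indices into the node array (-1 for none),
   so the file contains no addresses at all.  */
struct node_record
{
  int32_t value;
  int32_t kid[2];
};

/* Write N nodes from the array NODES; children must point into NODES.  */
static int
stream_out (FILE *f, const struct node *nodes, int32_t n)
{
  if (fwrite (&n, sizeof n, 1, f) != 1)
    return -1;
  for (int32_t i = 0; i < n; i++)
    {
      struct node_record r = { nodes[i].value, { -1, -1 } };
      for (int k = 0; k < 2; k++)
        if (nodes[i].kid[k])
          r.kid[k] = (int32_t) (nodes[i].kid[k] - nodes);
      if (fwrite (&r, sizeof r, 1, f) != 1)
        return -1;
    }
  return 0;
}

/* Read the nodes back into fresh memory, turning indices into pointers
   valid in this process.  Returns NULL on error; the caller frees.  */
static struct node *
stream_in (FILE *f, int32_t *n_out)
{
  int32_t n;
  if (fread (&n, sizeof n, 1, f) != 1 || n < 0)
    return NULL;
  struct node *nodes = calloc (n ? (size_t) n : 1, sizeof *nodes);
  if (!nodes)
    return NULL;
  for (int32_t i = 0; i < n; i++)
    {
      struct node_record r;
      if (fread (&r, sizeof r, 1, f) != 1
          || r.kid[0] >= n || r.kid[1] >= n)
        {
          free (nodes);
          return NULL;
        }
      nodes[i].value = r.value;
      for (int k = 0; k < 2; k++)
        nodes[i].kid[k] = r.kid[k] >= 0 ? &nodes[r.kid[k]] : NULL;
    }
  *n_out = n;
  return nodes;
}
```

The cost is an explicit walk on both sides, which is what the GC-image
approach avoids.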

just 0.02 GBP.
Iain


> 
> I wrote the following ugly hack:
> 
> --- ggc-common.c.jj	2021-08-19 11:42:27.365422400 +0200
> +++ ggc-common.c	2021-11-05 15:37:51.447222544 +0100
> @@ -404,6 +404,9 @@ struct mmap_info
> 
> /* Write out the state of the compiler to F.  */
> 
> +char *exestart = (char *) 2;
> +char *exeend = (char *) 2;
> +
> void
> gt_pch_save (FILE *f)
> {
> @@ -458,6 +461,14 @@ gt_pch_save (FILE *f)
>     for (rti = *rt; rti->base != NULL; rti++)
>       if (fwrite (rti->base, rti->stride, 1, f) != 1)
> 	fatal_error (input_location, "cannot write PCH file: %m");
> +      else if ((((uintptr_t) rti->base) & (sizeof (void *) - 1)) == 0)
> +        {
> +          char *const *p = (char *const *) rti->base;
> +          char *const *q = (char *const *) ((uintptr_t) rti->base + (rti->stride & ~(sizeof (void *) - 1)));
> +          for (; p < q; p++)
> +	    if (*p >= exestart && *p < exeend)
> +	      fprintf (stderr, "scalar at %p points to executable %p\n", (void *) p, (void *) *p);
> +        }
> 
>   /* Write out all the global pointers, after translation.  */
>   write_pch_globals (gt_ggc_rtab, &state);
> @@ -546,6 +557,15 @@ gt_pch_save (FILE *f)
>       state.ptrs[i]->note_ptr_fn (state.ptrs[i]->obj,
> 				  state.ptrs[i]->note_ptr_cookie,
> 				  relocate_ptrs, &state);
> +      if ((((uintptr_t) state.ptrs[i]->obj) & (sizeof (void *) - 1)) == 0)
> +        {
> +          char *const *p = (char *const *) (state.ptrs[i]->obj);
> +          char *const *q = (char *const *) ((uintptr_t) (state.ptrs[i]->obj) + (state.ptrs[i]->size & ~(sizeof (void *) - 1)));
> +          for (; p < q; p++)
> +	    if (*p >= exestart && *p < exeend)
> +	      fprintf (stderr, "object %p at %p points to executable %p\n", (void *) (state.ptrs[i]->obj), (void *) p, (void *) *p);
> +        }
> +
>       ggc_pch_write_object (state.d, state.f, state.ptrs[i]->obj,
> 			    state.ptrs[i]->new_addr, state.ptrs[i]->size,
> 			    state.ptrs[i]->note_ptr_fn == gt_pch_p_S);
> 
> and, under a debugger, set exestart and exeend from /proc/*/maps of the cc1plus
> process being debugged (the extent of cc1plus mappings).
> This resulted in something like:
> scalar at 0x3d869a8 points to executable 0x2dd85e0
> scalar at 0x3d869b0 points to executable 0x2dd85e4
> scalar at 0x3d869c8 points to executable 0x2dd85e7
> ...
> object 0x7fffea007e70 at 0x7fffea007e70 points to executable 0x11e48c2
> object 0x7fffe953dcc0 at 0x7fffe953dcc0 points to executable 0x201e222
> object 0x7fffe401d260 at 0x7fffe401d260 points to executable 0x4b0a27
> object 0x7fffea02fce0 at 0x7fffea02fce0 points to executable 0x18bb2b0
> object 0x7fffe7034ca0 at 0x7fffe7034ca0 points to executable 0x2f81537
> object 0x7fffe700f8a0 at 0x7fffe700f8a0 points to executable 0x2c36a32
> on stderr.  Unfortunately, I didn't try to rebuild the compiler as PIE, so
> the range was 0x400000 .. 0x3d9b000 and I'm not really sure whether
> everything it dumped was actually an address or just some nice number like
> 0x1000000.  Much better would be to have the compiler as PIE, run it twice and
> only look at values that actually changed, or link the compiler at some very
> unlikely virtual address offset so that addresses into it would be easy to
> spot.
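For reference, the extent could also be computed inside the process rather than set by hand from a debugger. What follows is only a minimal sketch, assuming a Linux host with /proc mounted; find_exe_extent and count_words_in_range are hypothetical names and are not part of ggc-common.c. The scan has the same weakness as the hack above: it cannot tell a real address from an integer that happens to fall in the range.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of GCC): store in *START and *END the
   lowest and highest address of the mappings backed by the running
   executable, as listed in /proc/self/maps.  Returns 0 on success and
   -1 if /proc is unavailable or no such mapping is found.  */
static int
find_exe_extent (uintptr_t *start, uintptr_t *end)
{
  char exe[4096];
  ssize_t n = readlink ("/proc/self/exe", exe, sizeof exe - 1);
  if (n < 0)
    return -1;
  exe[n] = '\0';

  FILE *f = fopen ("/proc/self/maps", "r");
  if (f == NULL)
    return -1;

  char line[4096 + 128];
  int found = 0;
  while (fgets (line, sizeof line, f) != NULL)
    {
      unsigned long lo, hi;
      int pos = 0;
      /* Line format: start-end perms offset dev inode pathname.  */
      if (sscanf (line, "%lx-%lx %*s %*s %*s %*s %n", &lo, &hi, &pos) != 2
          || pos == 0)
        continue;
      char *path = line + pos;
      path[strcspn (path, "\n")] = '\0';
      if (strcmp (path, exe) != 0)
        continue;
      if (!found || lo < *start)
        *start = lo;
      if (!found || hi > *end)
        *end = hi;
      found = 1;
    }
  fclose (f);
  return found ? 0 : -1;
}

/* Count the pointer-sized words in [BASE, BASE + SIZE) whose value lies
   in [START, END).  Unaligned objects are skipped, as in the hack.  */
static size_t
count_words_in_range (const void *base, size_t size,
                      uintptr_t start, uintptr_t end)
{
  assert (start <= end);
  if (((uintptr_t) base & (sizeof (void *) - 1)) != 0)
    return 0;

  size_t hits = 0;
  for (size_t i = 0; i + sizeof (uintptr_t) <= size; i += sizeof (uintptr_t))
    {
      uintptr_t v;
      /* memcpy avoids reading the object through a mismatched type.  */
      memcpy (&v, (const char *) base + i, sizeof v);
      if (v >= start && v < end)
        hits++;
    }
  return hits;
}
```

A caller would run find_exe_extent once and then pass the resulting range to count_words_in_range for each object about to be written.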
> All the "scalar at " messages are for offsets in the ovl_op_info
> array.
> struct GTY(()) ovl_op_info_t {
>  /* The IDENTIFIER_NODE for the operator.  */
>  tree identifier;
>  /* The name of the operator.  */
>  const char *name;
>  /* The mangled name of the operator.  */
>  const char *mangled_name;
>  /* The (regular) tree code.  */
>  enum tree_code tree_code : 16;
>  /* The (compressed) operator code.  */
>  enum ovl_op_code ovl_op_code : 8;
>  /* The ovl_op_flags of the operator */
>  unsigned flags : 8;
> };
> For that particular case gengtype emits:
>  {
>    &ovl_op_info[0][0].identifier,
>    1 * (2) * (OVL_OP_MAX),
>    sizeof (ovl_op_info[0][0]),
>    &gt_ggc_mx_tree_node,
>    &gt_pch_nx_tree_node
>  },
>  {
>    &ovl_op_info[0][0].name,
>    1 * (2) * (OVL_OP_MAX),
>    sizeof (ovl_op_info[0][0]),
>    (gt_pointer_walker) &gt_ggc_m_S,
>    (gt_pointer_walker) &gt_pch_n_S
>  },
>  {
>    &ovl_op_info[0][0].mangled_name,
>    1 * (2) * (OVL_OP_MAX),
>    sizeof (ovl_op_info[0][0]),
>    (gt_pointer_walker) &gt_ggc_m_S,
>    (gt_pointer_walker) &gt_pch_n_S
>  },
> so I believe we treat the identifier as always a GC memory object pointer,
> and name and mangled_name are const char * pointers which I vaguely remember
> we allow to be either NULL, or 1 or GC memory pointers or string literals
> (but can't find how it deals with that last category in the source).
> From the source:
> ovl_op_info_t ovl_op_info[2][OVL_OP_MAX] = 
>  {
>    {
>      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
>      {NULL_TREE, NULL, NULL, NOP_EXPR, OVL_OP_NOP_EXPR, 0},
> #define DEF_OPERATOR(NAME, CODE, MANGLING, FLAGS) \
>      {NULL_TREE, NAME, MANGLING, CODE, OVL_OP_##CODE, FLAGS},
> #define OPERATOR_TRANSITION }, {                        \
>      {NULL_TREE, NULL, NULL, ERROR_MARK, OVL_OP_ERROR_MARK, 0},
> #include "operators.def"
>    }
>  };
> where operators.def has e.g.:
> DEF_OPERATOR ("new", NEW_EXPR, "nw", OVL_OP_FLAG_ALLOC)
> in this particular array the strings are always string literals.
> I guess to get ovl_op_info out of the picture we could mark
> name and mangled_name as GTY((skip)).
> But that is just 178 records, the remaining 52520 are in GC memory
> objects.  Figuring out what exactly it is in will be harder...
> From the addresses it printed in the last column, the following point
> to the start of some cc1plus symbol:
>  3310: 0000000000c121d2   831 FUNC    LOCAL  DEFAULT   14 _ZL9min_vis_rPP9tree_nodePiPv
> 134773: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
>  6151: 0000000000fa67a9    47 FUNC    GLOBAL DEFAULT   14 _Z20ggc_round_alloc_sizem
> 188594: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
> 37908: 000000000102d0a0    26 FUNC    WEAK   DEFAULT   14 _Z4is_aIP7gswitch6gimpleEbPT0_
> 50655: 0000000001707c85    37 FUNC    LOCAL  DEFAULT   14 _ZL20realloc_for_line_mapPvm
> 131570: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
>  1653: 000000000178d3e0    66 FUNC    WEAK   DEFAULT   14 _ZNK3vecI13numbered_tree7va_heap6vl_ptrE5spaceEi
> 129108: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
> 51650: 000000000178e520    43 FUNC    WEAK   DEFAULT   14 _ZNK3vecI12loc_map_pair7va_heap8vl_embedE5spaceEj
> 77141: 0000000001b6cb5a   159 FUNC    LOCAL  DEFAULT   14 _ZL10emit_localP9tree_nodePKcmm
> 77142: 0000000001b6cbf9    75 FUNC    LOCAL  DEFAULT   14 _ZL8emit_bssP9tree_nodePKcmm
> 77143: 0000000001b6cc44    75 FUNC    LOCAL  DEFAULT   14 _ZL11emit_commonP9tree_nodePKcmm
> 77144: 0000000001b6cc8f   231 FUNC    LOCAL  DEFAULT   14 _ZL15emit_tls_commonP9tree_nodePKcmm
> 181390: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
> 25347: 0000000001b7e3d0    44 FUNC    GLOBAL DEFAULT   14 _Z21output_section_asm_opPKv
> 160243: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
> 163230: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
> 26343: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC1E11combined_fn
> 40584: 0000000001fbc260    27 FUNC    WEAK   DEFAULT   14 _ZN11code_helperC2E11combined_fn
> 12547: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
> 165150: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC2IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
> 181147: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
> 26558: 00000000029516e0    68 FUNC    WEAK   DEFAULT   14 _ZNSt4pairIPSt18_Rb_tree_node_baseS1_EC1IRPSt13_Rb_tree_nodeIP15basic_block_defERS1_Lb1EEEOT_OT0_
>  8400: 0000000002e13f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_1
>  8448: 0000000002e14260    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
> 10166: 0000000002e444a0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL11ALL_GPR_ARGE
> 11568: 0000000002e51420    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512FP16
> 11735: 0000000002e52f60    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_VPCLMULQDQ
> 12575: 0000000002e5f560    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_ROCKETLAKE
> 165019: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
>  9991: 0000000002e605a0    20 OBJECT  GLOBAL DEFAULT   16 class_narrowest_mode
> 12749: 0000000002e60f60   160 OBJECT  LOCAL  DEFAULT   16 _ZL22extra_order_size_table
> 14715: 0000000002e7e340    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
> 15895: 0000000002e84480    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_UINTR
> 17084: 0000000002e8c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SAPPHIRERAPIDS
> 18397: 0000000002e946a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_SSE4_2
> 18986: 0000000002e97580    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
> 18990: 0000000002e975c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL17PTA_GOLDMONT_PLUS
> 22195: 0000000002eb1640    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_FMA
> 30065: 0000000002eed6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_AVX512BF16
> 31474: 0000000002ef3560     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
> 34906: 0000000002f02580    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512BW
> 37696: 0000000002f0e420     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
> 37701: 0000000002f0e484     4 OBJECT  LOCAL  DEFAULT   16 _ZL40LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
> 38868: 0000000002f13420    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNL
> 39129: 0000000002f143c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_XSAVES
> 40610: 0000000002f1e7c0     1 OBJECT  LOCAL  DEFAULT   16 _ZStL19piecewise_construct
> 42157: 0000000002f293c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
> 42201: 0000000002f29680    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_SKYLAKE_AVX512
> 42207: 0000000002f296e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL18PTA_ICELAKE_SERVER
> 49618: 0000000002f556e0     4 OBJECT  LOCAL  DEFAULT   16 _ZN15zero_regs_flagsL8USED_ARGE
> 50904: 0000000002f5d4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_AVX
> 51188: 0000000002f5e6e0    48 OBJECT  LOCAL  DEFAULT   16 _ZN12_GLOBAL__N_1L17pass_data_tm_initE
> 56440: 0000000002f7d440    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
> 57404: 0000000002f81640     4 OBJECT  LOCAL  DEFAULT   16 _ZL14MAX_LOCATION_T
> 57424: 0000000002f816a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_64BIT
> 60100: 0000000002f903a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_TILE
> 67672: 0000000002fae460    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_COOPERLAKE
> 68780: 0000000002fb37c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AVX512CD
> 70316: 0000000002fbb4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_LWP
> 70637: 0000000002fbc7a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
> 70837: 0000000002fbd4e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
> 73878: 0000000002fcb960    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
> 79867: 00000000030435c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_SILVERMONT
> 81991: 0000000003053520    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_F16C
> 82244: 0000000003054500    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
> 86070: 00000000033ec560    99 OBJECT  LOCAL  DEFAULT   16 _ZL26znver1_agu_min_issue_delay
> 86071: 00000000033ec5e0  1334 OBJECT  LOCAL  DEFAULT   16 _ZL15geode_translate
> 86228: 00000000034419c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_NO_80387
> 94224: 0000000003849420    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
> 94230: 0000000003849480    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_XOP
> 94647: 000000000384aa40    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SGX
> 95488: 000000000384e4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_KNM
> 95820: 000000000384f6a0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_WBNOINVD
> 95822: 000000000384f6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
> 95824: 000000000384f6e0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_WAITPKG
> 96072: 0000000003850640    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_LZCNT
> 96074: 0000000003850660    16 OBJECT  LOCAL  DEFAULT   16 _ZL9PTA_MOVBE
> 96080: 00000000038506c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_SSE
> 98344: 000000000385a4c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_ENQCMD
> 99309: 000000000385da40    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_POPCNT
> 103332: 000000000386f2c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CLFLUSHOPT
> 103344: 000000000386f380    16 OBJECT  LOCAL  DEFAULT   16 _ZL7PTA_PKU
> 103352: 000000000386f400    16 OBJECT  LOCAL  DEFAULT   16 _ZL15PTA_AVX512VBMI2
> 103709: 0000000003870a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL8PTA_SSE3
> 104337: 0000000003873660    16 OBJECT  LOCAL  DEFAULT   16 _ZL10PTA_HRESET
> 106315: 000000000387d260    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_NO_TUNE
> 109183: 000000000388c160    16 OBJECT  LOCAL  DEFAULT   16 _ZL16PTA_AVX5124VNNIW
> 111159: 0000000003894a40    16 OBJECT  LOCAL  DEFAULT   16 _ZL14PTA_CANNONLAKE
> 112043: 00000000038994c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL11PTA_PTWRITE
> 112049: 0000000003899520    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_CLDEMOTE
> 113040: 000000000389d6c0    16 OBJECT  LOCAL  DEFAULT   16 _ZL12PTA_AMX_BF16
> 21876: 0000000003d8d5c0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
> 31109: 0000000003d8e100    40 OBJECT  LOCAL  DEFAULT   28 _ZL30unspecified_modref_access_node
> 78193: 0000000003d932e0    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
> 78366: 0000000003d93320    56 OBJECT  LOCAL  DEFAULT   28 _ZL22mem_alloc_origin_names
> 
> 	Jakub
  
Richard Biener Nov. 8, 2021, 7:16 a.m. UTC | #6
On Fri, Nov 5, 2021 at 5:37 PM Iain Sandoe <iain@sandoe.co.uk> wrote:
>
>
>
> > On 5 Nov 2021, at 15:25, Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
> >> On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek <jakub@redhat.com> wrote:
> >>>
> >>> On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
> >>>> I had the impression we have support for PCH file relocation to deal with ASLR
> >>>> at least on some platforms.
> >>>
> >>> Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
> >>> x86_64-linux, PCH will stop working unless one always invokes it with
> >>> disabled ASLR through personality.
> >>>
> >>> I think this is related to function pointers and pointers to .rodata/.data
> >>> etc. variables in GC memory, we currently do not relocate that.
> >>>
> >>> What we perhaps could do is (at least assuming all the ELF PT_LOAD segments
> >>> are adjacent with a single load base for them - I think at least ia64
> >>> non-PIE binaries were violating this by having .text and .data PT_LOAD
> >>> segments many terabytes apart with a hole in between not protected in any
> >>> way, but dunno if that is for PIEs too), perhaps try in a host
> >>> specific way remember the address range in which the function pointers and
> >>> .rodata/.data can exist, remember the extent start and end from PCH generation
> >>> and on PCH load query those addresses for the current compiler and relocate
> >>> everything in that extent by the load bias from the last run.
> >>> But, the assumption for this is that those function and data/rodata pointers
> >>> in GC memory are actually marked at least as pointers...
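One host-specific way to query that extent on a glibc/ELF host would be dl_iterate_phdr. The following is a sketch under that assumption only; struct exe_extent, record_main_extent and get_main_extent are hypothetical names, and nothing like this exists in the PCH code today. It relies on the documented glibc behaviour that the first object reported is the main program.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: span covered by the PT_LOAD segments of the
   main program, [start, end).  */
struct exe_extent
{
  uintptr_t start;
  uintptr_t end;
  int seen;
};

/* dl_iterate_phdr callback.  Records the lowest and highest address of
   the PT_LOAD segments of the first object reported and then stops.  */
static int
record_main_extent (struct dl_phdr_info *info, size_t size, void *data)
{
  struct exe_extent *ext = data;
  (void) size;
  assert (ext != NULL);

  for (int i = 0; i < info->dlpi_phnum; i++)
    {
      const ElfW (Phdr) *ph = &info->dlpi_phdr[i];
      if (ph->p_type != PT_LOAD)
        continue;
      /* dlpi_addr is the load bias: 0 for non-PIE, random for PIE.  */
      uintptr_t lo = (uintptr_t) info->dlpi_addr + ph->p_vaddr;
      uintptr_t hi = lo + ph->p_memsz;
      if (!ext->seen || lo < ext->start)
        ext->start = lo;
      if (!ext->seen || hi > ext->end)
        ext->end = hi;
      ext->seen = 1;
    }
  return 1;  /* Non-zero stops the iteration after the first object.  */
}

/* Fill *EXT for the running program.  Returns 0 on success, -1 if no
   PT_LOAD segment was reported.  */
static int
get_main_extent (struct exe_extent *ext)
{
  ext->start = 0;
  ext->end = 0;
  ext->seen = 0;
  dl_iterate_phdr (record_main_extent, ext);
  return ext->seen ? 0 : -1;
}
```

The start address would be saved at PCH generation, and the difference between the saved and the current start would give the load bias at PCH load. The sketch treats everything between the lowest and highest PT_LOAD address as one extent, so it inherits the adjacency assumption stated above.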
> >>
> >> If any such pointers exist they must be marked GTY((skip)) since they do not
> >> point to GC memory...  So we'd need to invent special-handling for those.
> >>
> >>> Do we e.g. have objects with virtual classes in GC memory and if so, do we
> >>> catch their virtual table pointers?
> >>
> >> Who knows, but then I don't remember adding stuff that should end in a PCH.
> >
> > So, I've investigated a little bit.
> > Apparently all the relocation we currently do for PCH is done at PCH write
> > time, we choose some address range in the address space we think will be likely
> > mmappable each time successfully, relocate all pointers pointing to GC
> > memory to point in there and then write that to file, together with the
> > scalar GTY global vars values and GTY pointers in global vars.
> > On PCH load, we just try to mmap memory in the right range, fail PCH load if
> > unsuccessful, and read the GC memory into that range and update scalar and
> > pointer GTY global vars from what we've recorded.
> > The patch that made PCH load fail for PIEs etc. was
> > https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
> > If we wanted to relocate pointers to functions and .data/.rodata etc.,
> > ideally we'd create a relocation list of addresses that should be
> > incremented by the bias and quickly relocate those.
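To make the relocation-list idea concrete, here is a small sketch. It assumes the image is handled as an array of pointer-sized words and that the executable's old extent is known; build_reloc_list and apply_reloc_list are hypothetical names. It also assumes that every word with a value in the extent really is a pointer, which is exactly what is not yet established for GC memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write time: store in OUT the indices of the words in IMAGE whose value
   lies in [START, END).  Returns the number of entries recorded, or
   (size_t) -1 if OUT (of CAPACITY entries) is too small.  */
static size_t
build_reloc_list (const uintptr_t *image, size_t nwords,
                  uintptr_t start, uintptr_t end,
                  size_t *out, size_t capacity)
{
  assert (start <= end);
  size_t n = 0;
  for (size_t i = 0; i < nwords; i++)
    if (image[i] >= start && image[i] < end)
      {
        if (n == capacity)
          return (size_t) -1;
        out[n++] = i;
      }
  return n;
}

/* Load time: add BIAS (new load address minus old load address) to each
   recorded word.  Unsigned wraparound makes a negative bias work too.  */
static void
apply_reloc_list (uintptr_t *image, const size_t *list, size_t n,
                  uintptr_t bias)
{
  for (size_t i = 0; i < n; i++)
    image[list[i]] += bias;
}
```

With the list stored in the PCH file, the load-time cost would be one addition per recorded word, and words outside the list would be left untouched.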
>
> It is hard to judge the relative effort in the two immediately visible solutions:
>
> 1. relocatable PCH
> 2. taking the tree streamer from the modules implementation, moving its home
>     to c-family and adding hooks so that each FE can stream its own special trees.
>
> ISTM, that part of the reason people dislike PCH is because the implementation is
> mixed up with the GC solution - the rendering is non-transparent etc.

Yes.  In particular it stands in the way of even thinking of doing something
different than GC for trees.

> So, in some ways, (2) above would be a better investment - the process of PCH is:
> generate:
> “get to the end of parsing a TU” .. stream the AST
> consume:
> .. see a header .. stream the PCH AST in if there is one available for the header.
>
> There is no reason for this to be mixed into the GC solution - the read in (currently)
> happens to an empty TU and there should be nothing in the AST that carries any
> reference to the compiler’s executable.

It makes the PCH read-in "cheap" - IIRC Google invested quite some work in
evaluating different PC* approaches but none in the end made a big enough
difference.  Given we now have a standards-backed PCH-like thing for C++
with modules, and given that for C (besides on Darwin...) PCH never made much
sense, I doubt investing in generalizing the C++ module support or making
PCH relocatable is worth the trouble.

Richard.

>
> just 0.02 GBP.
> Iain
>
>
  
Iain Sandoe Nov. 8, 2021, 7:43 a.m. UTC | #7
> On 8 Nov 2021, at 07:16, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On Fri, Nov 5, 2021 at 5:37 PM Iain Sandoe <iain@sandoe.co.uk> wrote:
>> 
>> 
>> 
>>> On 5 Nov 2021, at 15:25, Jakub Jelinek <jakub@redhat.com> wrote:
>>> 
>>> On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
>>>> On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> 
>>>>> On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches wrote:
>>>>>> I had the impression we have support for PCH file relocation to deal with ASLR
>>>>>> at least on some platforms.
>>>>> 
>>>>> Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
>>>>> x86_64-linux, PCH will stop working unless one always invokes it with
>>>>> disabled ASLR through personality.
>>>>> 
>>>>> I think this is related to function pointers and pointers to .rodata/.data
>>>>> etc. variables in GC memory, we currently do not relocate that.
>>>>> 
>>>>> What we perhaps could do is (at least assuming all the ELF PT_LOAD segments
>>>>> are adjacent with a single load base for them - I think at least ia64
>>>>> non-PIE binaries were violating this by having .text and .data PT_LOAD
>>>>> segments many terabytes apart with a hole in between not protected in any
>>>>> way, but dunno if that is for PIEs too), perhaps try in a host
>>>>> specific way remember the address range in which the function pointers and
>>>>> .rodata/.data can exist, remember the extent start and end from PCH generation
>>>>> and on PCH load query those addresses for the current compiler and relocate
>>>>> everything in that extent by the load bias from the last run.
>>>>> But, the assumption for this is that those function and data/rodata pointers
>>>>> in GC memory are actually marked at least as pointers...
>>>> 
>>>> If any such pointers exist they must be marked GTY((skip)) since they do not
>>>> point to GC memory...  So we'd need to invent special-handling for those.
>>>> 
>>>>> Do we e.g. have objects with virtual classes in GC memory and if so, do we
>>>>> catch their virtual table pointers?
>>>> 
>>>> Who knows, but then I don't remember adding stuff that should end in a PCH.
>>> 
>>> So, I've investigated a little bit.
>>> Apparently all the relocation we currently do for PCH is done at PCH write
>>> time, we choose some address range in the address space we think will be likely
>>> mmappable each time successfully, relocate all pointers pointing to GC
>>> memory to point in there and then write that to file, together with the
>>> scalar GTY global vars values and GTY pointers in global vars.
>>> On PCH load, we just try to mmap memory in the right range, fail PCH load if
>>> unsuccessful, and read the GC memory into that range and update scalar and
>>> pointer GTY global vars from what we've recorded.
>>> Patch that made PCH load to fail for PIEs etc. was
>>> https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
>>> If we wanted to relocate pointers to functions and .data/.rodata etc.,
>>> ideally we'd create a relocation list of addresses that should be
>>> incremented by the bias and quickly relocate those.
>> 
>> It is hard to judge the relative effort in the two immediately visible solutions:
>> 
>> 1. relocatable PCH
>> 2. taking the tree streamer from the modules implementation, moving its home
>>    to c-family and adding hooks so that each FE can stream its own special trees.
>> 
>> ISTM, that part of the reason people dislike PCH is because the implementation is
>> mixed up with the GC solution - the rendering is non-transparent etc.
> 
> Yes.  In particular it stands in the way of even thinking of doing something
> different than GC for trees.
> 
>> So, in some ways, (2) above would be a better investment - the process of PCH is:
>> generate:
>> “get to the end of parsing a TU” .. stream the AST
>> consume:
>> .. see a header .. stream the PCH AST in if there is one available for the header.
>> 
>> There is no reason for this to be mixed into the GC solution - the read in (currently)
>> happens to an empty TU and there should be nothing in the AST that carries any
>> reference to the compiler’s executable.
> 
> It makes the PCH read-in "cheap" - IIRC Google invested quite some work in
> evaluating different PC* approaches but none in the end made a big enough
> difference.

That was presumably a search for a more efficient streamer, whereas what we care
about more now is a less invasive streamer.

>  Given we now have a standards backed PCH-like thing for C++
> with modules and given that for C (besides on Darwin…)

For the record, it’s not actually Darwin that is calling for it - it’s Objective-C (which
uses large headers in much the same way as C++) - so it would/will affect GNUstep
in the same way.

> PCH never made much
> sense I doubt investing into generalizing the C++ module support or making
> PCH relocatable is worth the trouble.

There are certainly things higher on my TODO …
Iain

> 
> 
> Richard.
> 
>> 
>> just 0.02 GBP.
>> Iain
>> 
>> 
  
Jakub Jelinek Nov. 8, 2021, 11:46 a.m. UTC | #8
On Fri, Nov 05, 2021 at 04:37:09PM +0000, Iain Sandoe wrote:
> It is hard to judge the relative effort in the two immediately visible solutions:
> 
> 1. relocatable PCH
> 2. taking the tree streamer from the modules implementation, moving its home
>     to c-family and adding hooks so that each FE can stream its own special trees.
> 
> ISTM, that part of the reason people dislike PCH is because the implementation is
> mixed up with the GC solution - the rendering is non-transparent etc.
> 
> So, in some ways, (2) above would be a better investment - the process of PCH is:
> generate:
> “get to the end of parsing a TU” .. stream the AST
> consume:
> .. see a header .. stream the PCH AST in if there is one available for the header.
> 
> There is no reason for this to be mixed into the GC solution - the read in (currently)
> happens to an empty TU and there should be nothing in the AST that carries any
> reference to the compiler’s executable.

I'm afraid (2) is much more work though, even just for C++, because streaming
just modules and streaming arbitrary headers are quite different things.

Anyway, I've rebuilt my cc1plus as PIE (and am invoking it under a gdb wrapper
which has ASLR disabled when building x86_64-pc-linux-gnu/bits/stdc++.h.gch/O2g.gch).
With the hack patch I've posted earlier, the results are much shorter than
before: the "scalar at" messages appear only for the ovl_op_info array,
followed by
object 0x7fffe9f6b3c0 at 0x7fffe9f6b3c8 points to executable 0x555556a2a180
object 0x7fffe9f6b3c0 at 0x7fffe9f6b3d0 points to executable 0x55555772f9b9
object 0x7fffe9f6b3a0 at 0x7fffe9f6b3a8 points to executable 0x555556a2a180
object 0x7fffe9f6b3a0 at 0x7fffe9f6b3b0 points to executable 0x55555767da08
object 0x7fffe9f6b400 at 0x7fffe9f6b408 points to executable 0x555556a2a180
object 0x7fffe9f6b400 at 0x7fffe9f6b410 points to executable 0x55555772f9d2
object 0x7fffe9f6b480 at 0x7fffe9f6b488 points to executable 0x555556a306d0
object 0x7fffe9f6b3e0 at 0x7fffe9f6b3e8 points to executable 0x555556a2a180
object 0x7fffe9f6b3e0 at 0x7fffe9f6b3f0 points to executable 0x55555772f9c0
object 0x7fffe9f6b420 at 0x7fffe9f6b428 points to executable 0x555556a2b880
object 0x7fffe9f6b440 at 0x7fffe9f6b448 points to executable 0x555556a2b7a0
object 0x7fffe9f6b460 at 0x7fffe9f6b468 points to executable 0x555556a30710
object 0x7fffe9f79168 at 0x7fffe9f791d8 points to executable 0x5555576832b9
object 0x7ffff7fca000 at 0x7ffff7fca048 points to executable 0x55555670b7d0
object 0x7ffff7fca000 at 0x7ffff7fca050 points to executable 0x5555561eb040
If I look at the unique addresses in the last column after subtracting my
PIE base of 0x555555554000, they are:
0000000c97040	_Z20ggc_round_alloc_sizem
00000011b77d0	_ZL20realloc_for_line_mapPvm
00000014d6180	_Z21output_section_asm_opPKv
00000014d77a0	_ZL10emit_localP9tree_nodePKcmm
00000014d7880	_ZL15emit_tls_commonP9tree_nodePKcmm
00000014dc6d0	_ZL8emit_bssP9tree_nodePKcmm
00000014dc710	_ZL11emit_commonP9tree_nodePKcmm
0000002129a08	"\t.text"
000000212f2b9	"GNU C++17"
00000021db9b9	"\t.data"
00000021db9c0	"\t.section"
00000021db9d2	"\t.bss"
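
The subtraction used for the table above can be sketched as follows.  The
helper name pie_offset is made up for illustration and is not part of GCC;
the addresses in the usage note are the ones printed in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: turn a runtime address inside a
   PIE into its link-time offset by subtracting the load base.  */
static uint64_t
pie_offset (uint64_t runtime_addr, uint64_t load_base)
{
  /* An address below the base cannot belong to the executable.  */
  assert (runtime_addr >= load_base);
  return runtime_addr - load_base;
}
```

For instance, pie_offset (0x555556a2a180, 0x555555554000) gives 0x14d6180,
the offset listed for _Z21output_section_asm_opPKv.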

For the ovl_op_info array, I've mentioned that its element type is:
struct GTY(()) ovl_op_info_t {
  tree identifier;
  const char *name;
  const char *mangled_name;
  // And a bunch of scalar members.
};
and while .name and .mangled_name are initially set to NULL or string
literals, init_operators then reorders some of the elements and creates
those identifier trees (in my understanding this doesn't depend on any
command line switches, so it probably always happens the same way).
I said I didn't know what exactly PCH does with const char * or char *
members.  Looking in more detail, gt_ggc_m_S clearly supports:
1) NULL
2) non-GC addresses (so most likely string literals):
  /* Look up the page on which the object is alloced.  If it was not
     GC allocated, gracefully bail out.  */
  entry = safe_lookup_page_table_entry (p);
  if (!entry)
    return;
3) GC addresses not pointing to start of objects - here it assumes
   it points to STRING_CST payload and marks the STRING_CST
4) GC addresses which are starts of objects
And then, as can be seen in gt_pch_note_object, during PCH save it only
has an exception for NULL and (void *) 1; otherwise for gt_pch_p_S
it remembers the pointer and uses strlen (pointer) + 1 to determine
the size.  While I haven't verified it yet, my understanding is that
if some GC object, or e.g. ovl_op_info[?][?].{,mangled_}name, points to
a .rodata string literal when the PCH is saved, we don't make the .gch
file point to the string literal in .rodata, but instead allocate a copy
of that string literal in GC memory, so when the PCH is loaded, those
pointers will point to GC allocated memory containing a copy of that
string literal.
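
Assuming the (unverified) understanding above is right, the handling of a
char * member can be pictured with this minimal sketch.  The function
copy_string_for_pch is hypothetical and malloc only stands in for GC
allocation; this is not the code in ggc-common.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the described behaviour: NULL and (void *) 1
   are passed through untouched; anything else is treated as a
   NUL-terminated string of strlen (p) + 1 bytes and copied, so the
   result no longer refers to .rodata.  */
static char *
copy_string_for_pch (const char *p, size_t *size)
{
  *size = 0;
  if (p == NULL || p == (const char *) 1)
    return (char *) p;
  *size = strlen (p) + 1;
  char *copy = malloc (*size);   /* malloc stands in for GC allocation */
  assert (copy != NULL);
  memcpy (copy, p, *size);
  return copy;
}
```

With a literal such as "nw" the recorded size is 3 and the returned pointer
refers to the copy, not to the literal in the executable.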
So, in theory ovl_op_info would be fine.  My printout happens for
scalars when saving the scalar data, but after that we do
write_pch_globals and we have:
  {
    &ovl_op_info[0][0].identifier,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
  {
    &ovl_op_info[0][0].name,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    (gt_pointer_walker) &gt_ggc_m_S,
    (gt_pointer_walker) &gt_pch_n_S
  },
  {
    &ovl_op_info[0][0].mangled_name,
    1 * (2) * (OVL_OP_MAX),
    sizeof (ovl_op_info[0][0]),
    (gt_pointer_walker) &gt_ggc_m_S,
    (gt_pointer_walker) &gt_pch_n_S
  },
registered, and restoring those overwrites these 3 members
with something different (the latter two with pointers
to GC allocated copies of the string literals).
Sure, we could move ovl_op_info[?][?].identifier out of
the structure into another array, which would be GTY marked while this
one wouldn't, or name and mangled_name could be changed from const char *
to char [10] and char [12] arrays.
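
The second alternative could look roughly like this.  The struct and
function names are invented for illustration; only the array sizes 10 and
12 come from the text above, and the real ovl_op_info_t has more members.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant of the table entry: the strings live inside the
   struct, so a byte copy of an entry holds no address in the executable.  */
struct op_info_embedded
{
  char name[10];
  char mangled_name[12];
};

/* Fill in one entry, checking that both strings fit with their NUL.  */
static void
op_info_set (struct op_info_embedded *e, const char *name,
             const char *mangled)
{
  assert (strlen (name) < sizeof e->name);
  assert (strlen (mangled) < sizeof e->mangled_name);
  strcpy (e->name, name);
  strcpy (e->mangled_name, mangled);
}
```

An entry filled with op_info_set (&e, "new", "nw") can be written out and
read back with plain byte copies, whatever address the compiler is loaded
at.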

Now, looking at the (fortunately very few) other pointers into the
PIE, we clearly have:
typedef bool (*noswitch_section_callback) (tree decl, const char *name,
                                           unsigned HOST_WIDE_INT size,
                                           unsigned HOST_WIDE_INT rounded);
struct GTY(()) noswitch_section {
  struct section_common common;
  noswitch_section_callback GTY ((skip)) callback;
};
which covers
00000014d77a0   _ZL10emit_localP9tree_nodePKcmm
00000014d7880   _ZL15emit_tls_commonP9tree_nodePKcmm
00000014dc6d0   _ZL8emit_bssP9tree_nodePKcmm
00000014dc710   _ZL11emit_commonP9tree_nodePKcmm
typedef void *(*line_map_realloc) (void *, size_t);
typedef size_t (*line_map_round_alloc_size_func) (size_t);
class GTY(()) line_maps {
...
  line_map_realloc reallocator;
  line_map_round_alloc_size_func round_alloc_size;
...
};
which covers
0000000c97040   _Z20ggc_round_alloc_sizem
00000011b77d0   _ZL20realloc_for_line_mapPvm
typedef void (*unnamed_section_callback) (const void *);
struct GTY(()) unnamed_section {
  struct section_common common;
  unnamed_section_callback GTY ((skip)) callback;
  const void *GTY ((skip)) data;
  section *next;
};
which covers
00000014d6180   _Z21output_section_asm_opPKv
For the strings, I wonder about
struct GTY(()) tree_translation_unit_decl {
  struct tree_decl_common common;
  const char * GTY((skip(""))) language;
};
(i.e. whether we want to drop that GTY((skip(""))) stuff).
And the remaining one is again above, the data pointers
passed to get_unnamed_section.

So, if we want to make PCH work for PIEs, I'd say we can:
1) add a new GTY option, say callback, which would act like
   skip for non-PCH and for PCH would make us skip it but
   remember it for address bias translation
2) drop the skip for tree_translation_unit_decl::language
3) change get_unnamed_section to have const char * as
   last argument instead of const void *, change
   unnamed_section::data also to const char * and update
   everything related to that
4) maybe add a host hook for whether it is ok to support binaries
   changing addresses (the only thing I'm worried about is whether
   some host that uses function descriptors allocates them
   dynamically instead of having them somewhere in the
   executable)
5) maybe add a gengtype warning if it sees a function pointer
   in a GTY tracked structure without that new callback
   option
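
The address bias translation in 1) might be sketched like this.  Everything
here is an assumption for illustration: struct exe_extent and
relocate_callbacks do not exist in GCC, and addresses are held as plain
integers rather than real function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of where the executable was mapped at PCH save.  */
struct exe_extent
{
  uint64_t start;
  uint64_t end;   /* one past the last mapped byte */
};

/* Rewrite COUNT remembered slots.  Each slot holds an address that was
   valid when the executable was mapped at SAVED; add the load bias so it
   is valid for a mapping that starts at CURRENT_START.  */
static void
relocate_callbacks (uint64_t *slots, size_t count,
                    struct exe_extent saved, uint64_t current_start)
{
  /* Unsigned arithmetic wraps, so a lower new base works as well.  */
  uint64_t bias = current_start - saved.start;
  for (size_t i = 0; i < count; i++)
    {
      assert (slots[i] >= saved.start && slots[i] < saved.end);
      slots[i] += bias;
    }
}
```

Only slots marked with the new option would go through this; ordinary GC
pointers keep using the existing save-time relocation.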

	Jakub