Message ID | 20230125210636.2960049-5-ben.boeckel@kitware.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CE81C392AC20 for <patchwork@sourceware.org>; Wed, 25 Jan 2023 21:08:44 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CE81C392AC20 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1674680924; bh=lR2F06PJ2gLZr4ZqFEDl+Wb59mKqMF2JV32cxeIuAIk=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=gZkvUmkuPrFuwlr8Kqh8fIihcUpmvzBWLPRalEIsTGCW2b6udMwL2IFqYGLkKouP6 Jjs0mU5mUom/s0pnPISNXSpkatUsV99FYFOkcJTs6VBmZmgvFGVfTK0+AxXFm3N6vh WamnXCuw+tb8aSF1uaMeCHv6LbiiFMnEAuIdJ1TQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-qt1-x834.google.com (mail-qt1-x834.google.com [IPv6:2607:f8b0:4864:20::834]) by sourceware.org (Postfix) with ESMTPS id B72533857C55 for <gcc-patches@gcc.gnu.org>; Wed, 25 Jan 2023 21:06:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B72533857C55 Received: by mail-qt1-x834.google.com with SMTP id x5so17201652qti.3 for <gcc-patches@gcc.gnu.org>; Wed, 25 Jan 2023 13:06:58 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lR2F06PJ2gLZr4ZqFEDl+Wb59mKqMF2JV32cxeIuAIk=; b=2a/rUvB2hKUzqfX5TJYhj1oP0rAfiyIr+UBJD0HdvPBoiVfR1PbQ1VZ7D5lF7yCwDQ HzlRZiPnBiLP3uKhra1vGjya0tWYmrmeLeJKVlvGGyCJoG/A2NHSXIrBlqr8ufugL5tM hhEPQe2e4ngP5rxKM4G5VhCvAyp59De1tLxx/qXvdA53XdVu4g9yWUNR7rhoSOIFsCM2 qN2xhIBxYxJgD+2bg73LJpNMaTvjCwWdRbKlclZjfdml9Zbj9c1EJJADLNl5eG33zBMs R5oNJhwORUd/pn4tr34AOhzCQrwSA3o37tEjQjnqpj0dXXfVTFa1Gqy7UETknVxb6bWx MFvw== X-Gm-Message-State: AFqh2kqvn8Ea4O4flDXPdQs2oqnbkJzzy8hLLrJlFmUOvnDUyOX9weNT t+ehiGBk7ZiZVHCwUwiDVP80MvKMA3kcvpNB8Fw= X-Google-Smtp-Source: AMrXdXu5CquxNt5/DkBE6qkSCjm76Sl61JdDbnZuC1GOZng/z+wBzy72rwC0qlszWRZv0/8YVHu+iA== X-Received: by 2002:ac8:6643:0:b0:3b4:14c8:8e69 with SMTP id j3-20020ac86643000000b003b414c88e69mr43111068qtp.38.1674680818076; Wed, 25 Jan 2023 13:06:58 -0800 (PST) Received: from localhost (cpe-142-105-146-128.nycap.res.rr.com. [142.105.146.128]) by smtp.gmail.com with ESMTPSA id i7-20020ac84887000000b003b6a17e1996sm4026473qtq.83.2023.01.25.13.06.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 25 Jan 2023 13:06:57 -0800 (PST) To: gcc-patches@gcc.gnu.org Cc: Ben Boeckel <ben.boeckel@kitware.com>, jason@redhat.com, nathan@acm.org, fortran@gcc.gnu.org, gcc@gcc.gnu.org, brad.king@kitware.com Subject: [PATCH v5 4/5] c++modules: report imported CMI files as dependencies Date: Wed, 25 Jan 2023 16:06:35 -0500 Message-Id: <20230125210636.2960049-5-ben.boeckel@kitware.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230125210636.2960049-1-ben.boeckel@kitware.com> References: <20230125210636.2960049-1-ben.boeckel@kitware.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Ben Boeckel via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Ben Boeckel <ben.boeckel@kitware.com> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
P1689R5 support
|
|
Commit Message
Ben Boeckel
Jan. 25, 2023, 9:06 p.m. UTC
They affect the build, so report them via `-MF` mechanisms.
gcc/cp/
* module.cc (do_import): Report imported CMI files as
dependencies.
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
---
gcc/cp/module.cc | 2 ++
1 file changed, 2 insertions(+)
Comments
On 1/25/23 13:06, Ben Boeckel wrote: > They affect the build, so report them via `-MF` mechanisms. > > gcc/cp/ > > * module.cc (do_import): Report imported CMI files as > dependencies. Both this and the mapper dependency patch seem to cause most of the modules testcases to crash; please remember to run the regression tests (https://gcc.gnu.org/contribute.html#testing) > Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com> > --- > gcc/cp/module.cc | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc > index ebd30f63d81..dbd1b721616 100644 > --- a/gcc/cp/module.cc > +++ b/gcc/cp/module.cc > @@ -18966,6 +18966,8 @@ module_state::do_import (cpp_reader *reader, bool outermost) > dump () && dump ("CMI is %s", file); > if (note_module_cmi_yes || inform_cmi_p) > inform (loc, "reading CMI %qs", file); > + /* Add the CMI file to the dependency tracking. */ > + deps_add_dep (cpp_get_deps (reader), file); > fd = open (file, O_RDONLY | O_CLOEXEC | O_BINARY); > e = errno; > }
On Mon, Feb 13, 2023 at 13:33:50 -0500, Jason Merrill wrote: > Both this and the mapper dependency patch seem to cause most of the > modules testcases to crash; please remember to run the regression tests > (https://gcc.gnu.org/contribute.html#testing) Fixed for v6. `cpp_get_deps` can return `NULL` which `deps_add_dep` assumes to not be true; fixed by checking before calling. --Ben
On 1/25/23 16:06, Ben Boeckel wrote: > They affect the build, so report them via `-MF` mechanisms. Why isn't this covered by the existing code in preprocessed_module? > gcc/cp/ > > * module.cc (do_import): Report imported CMI files as > dependencies. > > Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com> > --- > gcc/cp/module.cc | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc > index ebd30f63d81..dbd1b721616 100644 > --- a/gcc/cp/module.cc > +++ b/gcc/cp/module.cc > @@ -18966,6 +18966,8 @@ module_state::do_import (cpp_reader *reader, bool outermost) > dump () && dump ("CMI is %s", file); > if (note_module_cmi_yes || inform_cmi_p) > inform (loc, "reading CMI %qs", file); > + /* Add the CMI file to the dependency tracking. */ > + deps_add_dep (cpp_get_deps (reader), file); > fd = open (file, O_RDONLY | O_CLOEXEC | O_BINARY); > e = errno; > }
On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: > On 1/25/23 16:06, Ben Boeckel wrote: > > They affect the build, so report them via `-MF` mechanisms. > > Why isn't this covered by the existing code in preprocessed_module? It appears as though it is neutered in patch 3 where `write_make_modules_deps` is used in `make_write` (or will use that name in v7 once I finish up testing). This logic cannot be used for p1689 output because it assumes the location and names of CMI files (`.c++m`) that will be necessary (it is related to the `CXX_IMPORTS +=` GNU make/libcody extensions that will, e.g., cause `ninja` to choke if it is read from `-MF` output as it uses "fancier" Makefile syntax than tools that are not actually `make` are going to be willing to support). This codepath is the *actual* filename being read at compile time and is relevant at all times; it may duplicate what `preprocessed_module` sets up. I'm also realizing that this is why I need to pass `-fdeps-format=p1689` when compiling…there may need to be another, more idiomatic, way to disable this additional syntax usage in `-MF` output. --Ben
On 6/22/23 22:45, Ben Boeckel wrote: > On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: >> On 1/25/23 16:06, Ben Boeckel wrote: >>> They affect the build, so report them via `-MF` mechanisms. >> >> Why isn't this covered by the existing code in preprocessed_module? > > It appears as though it is neutered in patch 3 where > `write_make_modules_deps` is used in `make_write` (or will use that name Why do you want to record the transitive modules? I would expect just noting the ones with imports directly in the TU would suffice (i.e check the 'outermost' arg) nathan
On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: > On 6/22/23 22:45, Ben Boeckel wrote: > > On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: > >> On 1/25/23 16:06, Ben Boeckel wrote: > >>> They affect the build, so report them via `-MF` mechanisms. > >> > >> Why isn't this covered by the existing code in preprocessed_module? > > > > It appears as though it is neutered in patch 3 where > > `write_make_modules_deps` is used in `make_write` (or will use that name > > Why do you want to record the transitive modules? I would expect just noting the > ones with imports directly in the TU would suffice (i.e check the 'outermost' arg) FWIW, only GCC has "fat" modules. MSVC and Clang both require the transitive closure to be passed. The idea there is to minimize the size of individual module files. If GCC only reads the "fat" modules, then only those should be recorded. If it reads other modules, they should be recorded as well. --Ben
On 6/25/23 12:36, Ben Boeckel wrote: > On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: >> On 6/22/23 22:45, Ben Boeckel wrote: >>> On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: >>>> On 1/25/23 16:06, Ben Boeckel wrote: >>>>> They affect the build, so report them via `-MF` mechanisms. >>>> >>>> Why isn't this covered by the existing code in preprocessed_module? >>> >>> It appears as though it is neutered in patch 3 where >>> `write_make_modules_deps` is used in `make_write` (or will use that name >> >> Why do you want to record the transitive modules? I would expect just noting the >> ones with imports directly in the TU would suffice (i.e check the 'outermost' arg) > > FWIW, only GCC has "fat" modules. MSVC and Clang both require the > transitive closure to be passed. The idea there is to minimize the size > of individual module files. > > If GCC only reads the "fat" modules, then only those should be recorded. > If it reads other modules, they should be recorded as well. But wouldn't the transitive modules be dependencies of the direct imports, so (re)building the direct imports would first require building the transitive modules anyway? Expressing the transitive closure of dependencies for each importer seems redundant when it can be easily derived from the direct dependencies of each module. Jason
On 7/18/23 16:52, Jason Merrill wrote: > On 6/25/23 12:36, Ben Boeckel wrote: >> On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: >>> On 6/22/23 22:45, Ben Boeckel wrote: >>>> On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: >>>>> On 1/25/23 16:06, Ben Boeckel wrote: >>>>>> They affect the build, so report them via `-MF` mechanisms. >>>>> >>>>> Why isn't this covered by the existing code in preprocessed_module? >>>> >>>> It appears as though it is neutered in patch 3 where >>>> `write_make_modules_deps` is used in `make_write` (or will use that name >>> >>> Why do you want to record the transitive modules? I would expect just noting the >>> ones with imports directly in the TU would suffice (i.e check the 'outermost' >>> arg) >> >> FWIW, only GCC has "fat" modules. MSVC and Clang both require the >> transitive closure to be passed. The idea there is to minimize the size >> of individual module files. >> >> If GCC only reads the "fat" modules, then only those should be recorded. >> If it reads other modules, they should be recorded as well. Please explain what you mean by fat modules. There seems to be confusion. > > But wouldn't the transitive modules be dependencies of the direct imports, so > (re)building the direct imports would first require building the transitive > modules anyway? Expressing the transitive closure of dependencies for each > importer seems redundant when it can be easily derived from the direct > dependencies of each module. > > Jason >
On Tue, Jul 18, 2023 at 16:52:44 -0400, Jason Merrill wrote: > On 6/25/23 12:36, Ben Boeckel wrote: > > On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: > >> On 6/22/23 22:45, Ben Boeckel wrote: > >>> On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: > >>>> On 1/25/23 16:06, Ben Boeckel wrote: > >>>>> They affect the build, so report them via `-MF` mechanisms. > >>>> > >>>> Why isn't this covered by the existing code in preprocessed_module? > >>> > >>> It appears as though it is neutered in patch 3 where > >>> `write_make_modules_deps` is used in `make_write` (or will use that name > >> > >> Why do you want to record the transitive modules? I would expect just noting the > >> ones with imports directly in the TU would suffice (i.e check the 'outermost' arg) > > > > FWIW, only GCC has "fat" modules. MSVC and Clang both require the > > transitive closure to be passed. The idea there is to minimize the size > > of individual module files. > > > > If GCC only reads the "fat" modules, then only those should be recorded. > > If it reads other modules, they should be recorded as well. For clarification, given: * a.cppm ``` export module a; ``` * b.cppm ``` export module b; import a; ``` * use.cppm ``` import b; ``` in a "fat" module setup, `use.cppm` only needs to be told about `b.cmi` because it contains everything that an importer needs to know about the `a` module (reachable types, re-exported bits, whatever). With the "thin" modules, `a.cmi` must be specified when compiling `use.cppm` to satisfy anything that may be required transitively (e.g., a return type of some `b` symbol is from `a`). MSVC and Clang (17+) use this model. I don't know MSVC's rationale, but Clang's is to make CMIs relocatable by not embedding paths to dependent modules in CMIs. This should help caching and network transfer sizes in distributed builds. LLVM/Clang discussion: https://discourse.llvm.org/t/c-20-modules-should-the-bmis-contain-paths-to-their-dependent-bmis/70422 https://github.com/llvm/llvm-project/issues/62707 Maybe I'm missing how this *actually* works in GCC as I've really only interacted with it through the command line, but I've not needed to mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and read through some embedded information in `b.cmi` or does `b.cmi` include enough information to not need to read it at all? If the former, distributed builds are going to have a problem knowing what files to send just from the command line (I'll call this "implicit thin"). If the latter, that is the "fat" CMI that I'm thinking of. > But wouldn't the transitive modules be dependencies of the direct > imports, so (re)building the direct imports would first require building > the transitive modules anyway? Expressing the transitive closure of > dependencies for each importer seems redundant when it can be easily > derived from the direct dependencies of each module. I'm not concerned whether it is transitive or not, really. If a file is read, it should be reported here regardless of the reason. Note that caching mechanisms may skip actually *doing* the reading, but the dependencies should still be reported from the cached results as-if the real work had been performed. --Ben
On 7/18/23 20:01, Ben Boeckel wrote: > On Tue, Jul 18, 2023 at 16:52:44 -0400, Jason Merrill wrote: >> On 6/25/23 12:36, Ben Boeckel wrote: >>> On Fri, Jun 23, 2023 at 08:12:41 -0400, Nathan Sidwell wrote: >>>> On 6/22/23 22:45, Ben Boeckel wrote: >>>>> On Thu, Jun 22, 2023 at 17:21:42 -0400, Jason Merrill wrote: >>>>>> On 1/25/23 16:06, Ben Boeckel wrote: >>>>>>> They affect the build, so report them via `-MF` mechanisms. >>>>>> >>>>>> Why isn't this covered by the existing code in preprocessed_module? >>>>> >>>>> It appears as though it is neutered in patch 3 where >>>>> `write_make_modules_deps` is used in `make_write` (or will use that name >>>> >>>> Why do you want to record the transitive modules? I would expect just noting the >>>> ones with imports directly in the TU would suffice (i.e check the 'outermost' arg) >>> >>> FWIW, only GCC has "fat" modules. MSVC and Clang both require the >>> transitive closure to be passed. The idea there is to minimize the size >>> of individual module files. >>> >>> If GCC only reads the "fat" modules, then only those should be recorded. >>> If it reads other modules, they should be recorded as well. > > For clarification, given: > > * a.cppm > ``` > export module a; > ``` > > * b.cppm > ``` > export module b; > import a; > ``` > > * use.cppm > ``` > import b; > ``` > > in a "fat" module setup, `use.cppm` only needs to be told about > `b.cmi` because it contains everything that an importer needs to know > about the `a` module (reachable types, re-exported bits, whateve > With > the "thin" modules, `a.cmi` must be specified when compiling `use.cppm` > to satisfy anything that may be required transitively (e.g., a return GCC is neither of these descriptions. a CMI does not contain the transitive closure of its imports. It contains an import table. That table lists the transitive closure of its imports (it needs that closure to do remapping), and that table contains the CMI pathnames of the direct imports. Those pathnames are absolute, if the mapper provded an absolute pathm or relative to the CMI repo. The rationale here is that if you're building a CMI, Foo, which imports a bunch of modules, those imported CMIs will have the same (relative) location in this compilation and in compilations importing Foo (why would you move them?) Note this is NOT inhibiting relocatable builds, because of the CMI repo. > Maybe I'm missing how this *actually* works in GCC as I've really only > interacted with it through the command line, but I've not needed to > mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and > read through some embedded information in `b.cmi` or does `b.cmi` > include enough information to not need to read it at all? If the former, > distributed builds are going to have a problem knowing what files to > send just from the command line (I'll call this "implicit thin"). If the > latter, that is the "fat" CMI that I'm thinking of. please don't use perjorative terms like 'fat' and 'thin'. > >> But wouldn't the transitive modules be dependencies of the direct >> imports, so (re)building the direct imports would first require building >> the transitive modules anyway? Expressing the transitive closure of >> dependencies for each importer seems redundant when it can be easily >> derived from the direct dependencies of each module. > > I'm not concerned whether it is transitive or not, really. If a file is > read, it should be reported here regardless of the reason. Note that > caching mechanisms may skip actually *doing* the reading, but the > dependencies should still be reported from the cached results as-if the > real work had been performed. > > --Ben
On Wed, Jul 19, 2023 at 17:11:08 -0400, Nathan Sidwell wrote: > GCC is neither of these descriptions. a CMI does not contain the transitive > closure of its imports. It contains an import table. That table lists the > transitive closure of its imports (it needs that closure to do remapping), and > that table contains the CMI pathnames of the direct imports. Those pathnames > are absolute, if the mapper provded an absolute pathm or relative to the CMI repo. > > The rationale here is that if you're building a CMI, Foo, which imports a bunch > of modules, those imported CMIs will have the same (relative) location in this > compilation and in compilations importing Foo (why would you move them?) Note > this is NOT inhibiting relocatable builds, because of the CMI repo. But it is inhibiting distributed builds because the distributing tool would need to know: - what CMIs are actually imported (here, "read the module mapper file" (in CMake's case, this is only the modules that are needed; a single massive mapper file for an entire project would have extra entries) or "act as a proxy for the socket/program specified" for other approaches); - read the CMIs as it sends to the remote side to gather any other CMIs that may be needed (recursively); Contrast this with the MSVC and Clang (17+) mechanism where the command line contains everything that is needed and a single bolus can be sent. And relocatable is probably fine. How does it interact with reproducible builds? Or are GCC CMIs not really something anyone should consider for installation (even as a "here, maybe this can help consumers" mechanism)? > On 7/18/23 20:01, Ben Boeckel wrote: > > Maybe I'm missing how this *actually* works in GCC as I've really only > > interacted with it through the command line, but I've not needed to > > mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and > > read through some embedded information in `b.cmi` or does `b.cmi` > > include enough information to not need to read it at all? If the former, > > distributed builds are going to have a problem knowing what files to > > send just from the command line (I'll call this "implicit thin"). If the > > latter, that is the "fat" CMI that I'm thinking of. > > please don't use perjorative terms like 'fat' and 'thin'. Sorry, I was internally analogizing to "thinLTO". --Ben
On 7/19/23 20:47, Ben Boeckel wrote: > On Wed, Jul 19, 2023 at 17:11:08 -0400, Nathan Sidwell wrote: >> GCC is neither of these descriptions. a CMI does not contain the transitive >> closure of its imports. It contains an import table. That table lists the >> transitive closure of its imports (it needs that closure to do remapping), and >> that table contains the CMI pathnames of the direct imports. Those pathnames >> are absolute, if the mapper provded an absolute pathm or relative to the CMI repo. >> >> The rationale here is that if you're building a CMI, Foo, which imports a bunch >> of modules, those imported CMIs will have the same (relative) location in this >> compilation and in compilations importing Foo (why would you move them?) Note >> this is NOT inhibiting relocatable builds, because of the CMI repo. > > But it is inhibiting distributed builds because the distributing tool > would need to know: > > - what CMIs are actually imported (here, "read the module mapper file" > (in CMake's case, this is only the modules that are needed; a single > massive mapper file for an entire project would have extra entries) or > "act as a proxy for the socket/program specified" for other > approaches); This information is in the machine (& human) README section of the CMI. > - read the CMIs as it sends to the remote side to gather any other CMIs > that may be needed (recursively); > > Contrast this with the MSVC and Clang (17+) mechanism where the command > line contains everything that is needed and a single bolus can be sent. um, the build system needs to create that command line? Where does the build system get that information? IIUC it'll need to read some file(s) to do that. > And relocatable is probably fine. How does it interact with reproducible > builds? Or are GCC CMIs not really something anyone should consider for > installation (even as a "here, maybe this can help consumers" > mechanism)? Module CMIs should be considered a cacheable artifact. They are neither object files nor source files. > >> On 7/18/23 20:01, Ben Boeckel wrote: >>> Maybe I'm missing how this *actually* works in GCC as I've really only >>> interacted with it through the command line, but I've not needed to >>> mention `a.cmi` when compiling `use.cppm`. Is `a.cmi` referenced and >>> read through some embedded information in `b.cmi` or does `b.cmi` >>> include enough information to not need to read it at all? If the former, >>> distributed builds are going to have a problem knowing what files to >>> send just from the command line (I'll call this "implicit thin"). If the >>> latter, that is the "fat" CMI that I'm thinking of. >> >> please don't use perjorative terms like 'fat' and 'thin'. > > Sorry, I was internally analogizing to "thinLTO". > > --Ben
On Thu, Jul 20, 2023 at 17:00:32 -0400, Nathan Sidwell wrote: > On 7/19/23 20:47, Ben Boeckel wrote: > > But it is inhibiting distributed builds because the distributing tool > > would need to know: > > > > - what CMIs are actually imported (here, "read the module mapper file" > > (in CMake's case, this is only the modules that are needed; a single > > massive mapper file for an entire project would have extra entries) or > > "act as a proxy for the socket/program specified" for other > > approaches); > > This information is in the machine (& human) README section of the CMI. Ok. That leaves it up to distributing build tools to figure out at least. > > - read the CMIs as it sends to the remote side to gather any other CMIs > > that may be needed (recursively); > > > > Contrast this with the MSVC and Clang (17+) mechanism where the command > > line contains everything that is needed and a single bolus can be sent. > > um, the build system needs to create that command line? Where does the build > system get that information? IIUC it'll need to read some file(s) to do that. It's chained through the P1689 information in the collator as needed. No extra files need to be read (at least with CMake's approach); certainly not CMI files. > > And relocatable is probably fine. How does it interact with reproducible > > builds? Or are GCC CMIs not really something anyone should consider for > > installation (even as a "here, maybe this can help consumers" > > mechanism)? > > Module CMIs should be considered a cacheable artifact. They are neither object > files nor source files. Sure, cachable sounds fine. What about the installation? --Ben
On 7/21/23 10:57, Ben Boeckel wrote: > On Thu, Jul 20, 2023 at 17:00:32 -0400, Nathan Sidwell wrote: >> On 7/19/23 20:47, Ben Boeckel wrote: >>> But it is inhibiting distributed builds because the distributing tool >>> would need to know: >>> >>> - what CMIs are actually imported (here, "read the module mapper file" >>> (in CMake's case, this is only the modules that are needed; a single >>> massive mapper file for an entire project would have extra entries) or >>> "act as a proxy for the socket/program specified" for other >>> approaches); >> >> This information is in the machine (& human) README section of the CMI. > > Ok. That leaves it up to distributing build tools to figure out at > least. > >>> - read the CMIs as it sends to the remote side to gather any other CMIs >>> that may be needed (recursively); >>> >>> Contrast this with the MSVC and Clang (17+) mechanism where the command >>> line contains everything that is needed and a single bolus can be sent. >> >> um, the build system needs to create that command line? Where does the build >> system get that information? IIUC it'll need to read some file(s) to do that. > > It's chained through the P1689 information in the collator as needed. No > extra files need to be read (at least with CMake's approach); certainly > not CMI files. It occurs to me that the model I am envisioning is similar to CMake's object libraries. Object libraries are a convenient name for a bunch of object files. IIUC they're linked by naming the individual object files (or I think the could be implemented as a static lib linked with --whole-archive path/to/libfoo.a -no-whole-archive. But for this conversation consider them a bunch of separate object files with a convenient group name. Consider also that object libraries could themselves contain object libraries (I don't know of they can, but it seems like a useful concept). Then one could create an object library from a collection of object files and object libraries (recursively). CMake would handle the transitive gtaph. Now, allow an object library to itself have some kind of tangible, on-disk representation. *BUT* not like a static library -- it doesn't include the object files. Now that immediately maps onto modules. CMI: Object library Direct imports: Direct object libraries of an object library This is why I don't understand the need explicitly indicate the indirect imports of a CMI. CMake knows them, because it knows the graph. > >>> And relocatable is probably fine. How does it interact with reproducible >>> builds? Or are GCC CMIs not really something anyone should consider for >>> installation (even as a "here, maybe this can help consumers" >>> mechanism)? >> >> Module CMIs should be considered a cacheable artifact. They are neither object >> files nor source files. > > Sure, cachable sounds fine. What about the installation? > > --Ben
On Fri, Jul 21, 2023 at 16:23:07 -0400, Nathan Sidwell wrote: > It occurs to me that the model I am envisioning is similar to CMake's object > libraries. Object libraries are a convenient name for a bunch of object files. > IIUC they're linked by naming the individual object files (or I think the could > be implemented as a static lib linked with --whole-archive path/to/libfoo.a > -no-whole-archive. But for this conversation consider them a bunch of separate > object files with a convenient group name. Yes, `--whole-archive` would work great if it had any kind of portability across CMake's platform set. > Consider also that object libraries could themselves contain object libraries (I > don't know of they can, but it seems like a useful concept). Then one could > create an object library from a collection of object files and object libraries > (recursively). CMake would handle the transitive gtaph. I think this detail is relevant, but you can use `$<TARGET_OBJECT:subobjlib>` as an `INTERFACE` sources and it would act like that, but it is an explicit thing. Instead, `OBJECT` libraries *only* provide their objects to targets that *directly* link them. If not, given this: A (OBJECT library) B (library of some kind; links PUBLIC to A) C (links to B) If `A` has things like linker flags (or, more likely, libraries) as part of its usage requirements, C will get them on is link line. However, if OBJECT files are transitive in the same way, the linker (on most platforms at least) chokes because it now has duplicates of all of A's symbols: those from the B library and those from A's objects on the link line. > Now, allow an object library to itself have some kind of tangible, on-disk > representation. *BUT* not like a static library -- it doesn't include the > object files. > > > Now that immediately maps onto modules. > > CMI: Object library > Direct imports: Direct object libraries of an object library > > This is why I don't understand the need explicitly indicate the indirect imports > of a CMI. CMake knows them, because it knows the graph. Sure, *CMake* knows them, but the *build tool* needs to be told (typically `make` or `ninja`) because it is what is actually executing the build graph. The way this is communicated is via `-MF` files and that's what I'm providing in this patch. Note that `ninja` does not allow rules to specify such dependencies for other rules than the one it is reading the file for. --Ben
On 7/23/23 20:26, Ben Boeckel wrote: > On Fri, Jul 21, 2023 at 16:23:07 -0400, Nathan Sidwell wrote: >> It occurs to me that the model I am envisioning is similar to CMake's object >> libraries. Object libraries are a convenient name for a bunch of object files. >> IIUC they're linked by naming the individual object files (or I think the could >> be implemented as a static lib linked with --whole-archive path/to/libfoo.a >> -no-whole-archive. But for this conversation consider them a bunch of separate >> object files with a convenient group name. > > Yes, `--whole-archive` would work great if it had any kind of > portability across CMake's platform set. > >> Consider also that object libraries could themselves contain object libraries (I >> don't know of they can, but it seems like a useful concept). Then one could >> create an object library from a collection of object files and object libraries >> (recursively). CMake would handle the transitive gtaph. > > I think this detail is relevant, but you can use > `$<TARGET_OBJECT:subobjlib>` as an `INTERFACE` sources and it would act > like that, but it is an explicit thing. Instead, `OBJECT` libraries > *only* provide their objects to targets that *directly* link them. If > not, given this: > > A (OBJECT library) > B (library of some kind; links PUBLIC to A) > C (links to B) > > If `A` has things like linker flags (or, more likely, libraries) as part > of its usage requirements, C will get them on is link line. However, if > OBJECT files are transitive in the same way, the linker (on most > platforms at least) chokes because it now has duplicates of all of A's > symbols: those from the B library and those from A's objects on the link > line. > >> Now, allow an object library to itself have some kind of tangible, on-disk >> representation. *BUT* not like a static library -- it doesn't include the >> object files. >> >> >> Now that immediately maps onto modules. >> >> CMI: Object library >> Direct imports: Direct object libraries of an object library >> >> This is why I don't understand the need explicitly indicate the indirect imports >> of a CMI. CMake knows them, because it knows the graph. > > Sure, *CMake* knows them, but the *build tool* needs to be told > (typically `make` or `ninja`) because it is what is actually executing > the build graph. The way this is communicated is via `-MF` files and > that's what I'm providing in this patch. Note that `ninja` does not > allow rules to specify such dependencies for other rules than the one it > is reading the file for. But since the direct imports need to be rebuilt themselves if the transitive imports change, the build graph should be the same whether or not the transitive imports are repeated? Either way, if a transitive import changes you need to rebuild the direct import and then the importer. I guess it shouldn't hurt to have the transitive imports in the -MF file, as long as they aren't also in the p1689 file, so I'm not particularly opposed to this change, but I don't see how it makes a practical difference. Jason
On Thu, Jul 27, 2023 at 18:13:48 -0700, Jason Merrill wrote: > On 7/23/23 20:26, Ben Boeckel wrote: > > Sure, *CMake* knows them, but the *build tool* needs to be told > > (typically `make` or `ninja`) because it is what is actually executing > > the build graph. The way this is communicated is via `-MF` files and > > that's what I'm providing in this patch. Note that `ninja` does not > > allow rules to specify such dependencies for other rules than the one it > > is reading the file for. > > But since the direct imports need to be rebuilt themselves if the > transitive imports change, the build graph should be the same whether or > not the transitive imports are repeated? Either way, if a transitive > import changes you need to rebuild the direct import and then the importer. I suppose I have seen enough bad build systems that don't do everything correctly that I'm interested in creating "pits of success" rather than "well, you didn't get thing X 100% correct, so you're screwed here too". The case that I think is most likely here is that someone has a "superbuild" with 3 projects A, B, and C where C uses B and B uses A. At the top-level the superbuild exposes just "make projectA projectB projectC"-granularity (rather than a combined build graph; they may use different build systems) and then users go into some projectC directly and forget to update projectB after updating projectA (known to all use the same compiler/flags so that CMI sharing is possible). The build it still broken, but ideally they get notified in some useful way when rebuilding the TU rather than…whatever ends up catching the problem incidentally. > I guess it shouldn't hurt to have the transitive imports in the -MF > file, as long as they aren't also in the p1689 file, so I'm not > particularly opposed to this change, but I don't see how it makes a > practical difference. Correct. The P1689 shouldn't even know about transitive imports (well, maybe from header units?) as it just records "I saw an `import` statement" and should never look up CMI files (indeed, we would need another scanning step to know what CMI files to create for the P1689 scan if they were necessary…). --Ben
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc index ebd30f63d81..dbd1b721616 100644 --- a/gcc/cp/module.cc +++ b/gcc/cp/module.cc @@ -18966,6 +18966,8 @@ module_state::do_import (cpp_reader *reader, bool outermost) dump () && dump ("CMI is %s", file); if (note_module_cmi_yes || inform_cmi_p) inform (loc, "reading CMI %qs", file); + /* Add the CMI file to the dependency tracking. */ + deps_add_dep (cpp_get_deps (reader), file); fd = open (file, O_RDONLY | O_CLOEXEC | O_BINARY); e = errno; }