[07/10] gdb: add qMachineId packet
Commit Message
In later commits I want to make two related changes to GDB:
1. Have GDB know when it can safely ignore a 'target:' prefix in the
sysroot, and so avoid copying files from the remote target, and
2. Have GDB know that it can safely use the file specified with the
'file' command to start a remote inferior, rather than requiring the
'set remote exec-file' command to have been used.
Both of these changes require that GDB be able to know if it is
running on the same host as the remote target.
In this commit I propose a mechanism by which this can be achieved,
that is, the introduction of the qMachineId packet.
The idea of the qMachineId packet is that, during the initial
connection phase, GDB will send the qMachineId packet, and the remote
will return a reply that describes the machine the remote target is
running on.
Back on the GDB side, GDB will generate a description of the machine
it is running on and compare this to the reply received from the
remote target.
If the two match then GDB will assume it is running on the same
machine as the remote target, and that it can access the same set of
files. In this case we can enable the two improvements listed above.
If the remote target doesn't support qMachineId, or the reply from the
remote target doesn't match the machine-id generated within GDB, then
GDB will assume that the target is truly remote, just as it does right
now.
This commit does NOT implement the two improvements listed above;
these will be added in follow-on commits. This commit just adds
support for the qMachineId packet.
Generating a suitable machine-id is, I think, always going to be
target specific. As such, I've structured the code in a way that
allows different targets to provide their own implementations, but
I've only implemented a solution for the Linux targets.
The reply to a qMachineId packet looks like this:
  predicate;key=value[;key=value]*
the idea being that the reply consists of a number of key/value pairs,
each of which must match in order for GDB to consider the machine-id a
match. I currently propose just two keys:
  linux-boot-id - this returns the value from the file
    /proc/sys/kernel/random/boot_id, which, if I understand correctly,
    should be unique(ish) for each boot of each machine, and
  cuserid - this returns the value of the cuserid call.
My thinking is that if we know we are on the same machine (thanks to
linux-boot-id), and we know we are the same effective user (thanks to
cuserid) then there's a pretty good chance that GDB and the remote can
access the same set of files.
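To make the reply format concrete, here is a minimal sketch of how a 'predicate;' reply could be split into key/value pairs. The helper name parse_predicate_reply is hypothetical and is not the parser from this patch; the rules it enforces (non-empty keys and values, no extra '=', unique keys) follow the documentation added later in this patch.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

using kv_map = std::unordered_map<std::string, std::string>;

/* Hypothetical helper, not part of the patch.  Split REPLY, which is
   expected to look like "predicate;key=value[;key=value]*", into its
   key/value pairs.  Return an empty optional if REPLY is malformed.  */

std::optional<kv_map>
parse_predicate_reply (const std::string &reply)
{
  const std::string prefix = "predicate;";
  if (reply.compare (0, prefix.size (), prefix) != 0)
    return std::nullopt;

  kv_map pairs;
  std::size_t pos = prefix.size ();
  while (pos <= reply.size ())
    {
      std::size_t end = reply.find (';', pos);
      if (end == std::string::npos)
	end = reply.size ();

      std::string item = reply.substr (pos, end - pos);
      std::size_t eq = item.find ('=');

      /* Reject a missing '=', an empty key, an empty value, or a
	 second '=' within the same pair.  */
      if (eq == std::string::npos || eq == 0 || eq + 1 == item.size ()
	  || item.find ('=', eq + 1) != std::string::npos)
	return std::nullopt;

      /* Each key must be unique within a single reply.  */
      if (!pairs.emplace (item.substr (0, eq), item.substr (eq + 1)).second)
	return std::nullopt;

      pos = end + 1;
    }

  return pairs;
}
```

A reply with no pairs at all ("predicate;") or with a trailing ';' is rejected by this sketch, since at least one non-empty pair is required.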
As well as the 'predicate;' based reply, a remote can respond to a
qMachineId packet with one of these replies:
local
remote
The 'local' reply forces GDB to consider the remote as being local to
GDB. I've documented this as something that should be used with
extreme care: it would be easy for a user to run a remote
non-locally, in which case, if the remote claims to be local, GDB
is going to try to access the files directly ... but maybe there will
be some use case where this is helpful.
The 'remote' reply is the opposite: it forces GDB to consider the
remote as being truly remote (which is the behaviour we have today).
It's always safe to return this reply, though this prevents GDB from
performing any of the improvements listed above.
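The three reply forms, plus the "packet not supported" case, can be summarised with a small sketch. All names here (machine_id_reply_kind, classify_machine_id_reply, treat_as_same_host) are hypothetical and only illustrate the decision described above; this is not the code in remote.c.

```cpp
#include <string>

/* Hypothetical classification of a qMachineId reply.  */
enum class machine_id_reply_kind
{
  local,	/* Remote insists it is local to GDB.  */
  remote,	/* Remote insists it is truly remote.  */
  predicate,	/* Key/value pairs that GDB must validate.  */
  unrecognised	/* Empty (unsupported packet) or unknown reply.  */
};

machine_id_reply_kind
classify_machine_id_reply (const std::string &reply)
{
  if (reply == "local")
    return machine_id_reply_kind::local;
  if (reply == "remote")
    return machine_id_reply_kind::remote;
  if (reply.rfind ("predicate;", 0) == 0)
    return machine_id_reply_kind::predicate;
  return machine_id_reply_kind::unrecognised;
}

/* Return true if GDB should assume it shares a host with the remote.
   PREDICATE_MATCHED is the result of comparing the remote's key/value
   pairs against the values GDB computed for its own host; it is only
   consulted for a 'predicate;' reply.  Anything GDB does not
   understand is treated as truly remote, which is always safe.  */

bool
treat_as_same_host (machine_id_reply_kind kind, bool predicate_matched)
{
  switch (kind)
    {
    case machine_id_reply_kind::local:
      return true;
    case machine_id_reply_kind::predicate:
      return predicate_matched;
    case machine_id_reply_kind::remote:
    case machine_id_reply_kind::unrecognised:
      return false;
    }
  return false;
}
```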
For the GDB/gdbserver implementation, the code to generate the values
for the machine-id has been placed in gdb/nat/linux-machine-id.c, and
is shared between GDB and gdbserver.
There are no tests in this commit as there are no new commands or
user-visible behaviour changes (without turning on debug output) that
can be seen. However, later commits will add new functionality, which
will rely on this packet working correctly.
---
gdb/Makefile.in | 3 +
gdb/NEWS | 5 ++
gdb/configure.nat | 2 +-
gdb/doc/gdb.texinfo | 100 ++++++++++++++++++++++
gdb/linux-nat.c | 35 ++++++++
gdb/nat/linux-machine-id.c | 60 +++++++++++++
gdb/nat/linux-machine-id.h | 44 ++++++++++
gdb/remote-machine-id.c | 69 +++++++++++++++
gdb/remote-machine-id.h | 108 ++++++++++++++++++++++++
gdb/remote.c | 169 +++++++++++++++++++++++++++++++++++++
gdbserver/Makefile.in | 1 +
gdbserver/configure.srv | 2 +-
gdbserver/linux-low.cc | 19 +++++
gdbserver/linux-low.h | 2 +
gdbserver/server.cc | 14 +++
gdbserver/target.cc | 8 ++
gdbserver/target.h | 9 ++
17 files changed, 648 insertions(+), 2 deletions(-)
create mode 100644 gdb/nat/linux-machine-id.c
create mode 100644 gdb/nat/linux-machine-id.h
create mode 100644 gdb/remote-machine-id.c
create mode 100644 gdb/remote-machine-id.h
Comments
> Cc: Andrew Burgess <aburgess@redhat.com>
> Date: Wed, 16 Aug 2023 16:55:03 +0100
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
>
> diff --git a/gdb/NEWS b/gdb/NEWS
> index 9839330c46d..d83f097d937 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -272,6 +272,11 @@ qDefaultExecAndArgs
> which the server was started. If no such information was given to
> the server then this is reflected in the reply.
>
> +qMachineId
> + This packet returns an identifier that allows GDB to determine if
> + the remote server and GDB are running on the same host, and can see
> + the same filesystem.
> +
> *** Changes in GDB 13
This part is OK.
> @@ -44748,6 +44749,49 @@
> Indicates an error was encountered.
> @end table
>
> +@anchor{Machine-Id Packet}
> +@item qMachineId
> +@cindex query remote machine-id, remote request
> +@cindex @samp{qMachineId} packet
Please move the @cindex entries before the @item line, so that
index-search in an Info reader lands you on the line corresponding to
@item, not on the text after that.
> +within a single reply. See @ref{Machine-Id Details} for details of
^
Comma there, please (it is needed for some older versions of Texinfo).
Btw, when a sentence starts with "See @ref", you could simplify by
using @xref, as that is its main purpose.
Thanks.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Hello Andrew,
This is a very interesting feature, thanks. For now at least, I have
just one comment on the machine-id for Linux:
Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
> Generating a suitable machine-id is, I think, always going to be
> target specific. As such, I've structured the code in a way that
> allows different targets to provide their own implementations, but
> I've only implemented a solution for the Linux targets.
>
> The reply to a qMachineId packet looks like this:
>
> predicate;key=value[;key=value]*
>
> the idea being that the reply consists of a number of key/value pairs,
> each of which must match in order for GDB to consider the machine-id a
> match. I currently propose just two keys:
>
> linux-boot-id - this returns the value from the file
> /proc/sys/kernel/random/boot_id, which, if I understand correctly,
> should be unique(ish) for each boot of each machine, and
>
> cuserid - this returns the value of the cuserid call.
>
> My thinking is that if we know we are on the same machine (thanks to
> linux-boot-id), and we know we are the same effective user (thanks to
> cuserid) then there's a pretty good chance that GDB and the remote can
> access the same set of files.
Unfortunately the two keys above won't detect when GDB and gdbserver are
running on the same machine but in different containers. I suggest
adding a new key for that:
mountinfo - The hash of the contents of the target's
/proc/self/mountinfo file.
The hash algorithm could be either SHA1 or MD5, which have
implementations in libiberty.
Disclaimer: I'm no expert in Linux containers, so I wouldn't be
surprised if there were a better way to differentiate containers (or,
more specifically for this use case, filesystem namespaces).
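A rough sketch of how such a value might be computed follows. To keep the example self-contained it uses 64-bit FNV-1a as a stand-in; that is an assumption made purely for illustration, and a real implementation would presumably use the SHA1 or MD5 code from libiberty as suggested above. The function names are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>

/* 64-bit FNV-1a hash of DATA.  Stand-in only: not a cryptographic
   hash, and not the SHA1/MD5 suggested in the text.  */

std::uint64_t
fnv1a_64 (const std::string &data)
{
  std::uint64_t hash = 14695981039346656037ULL;	/* FNV offset basis.  */
  for (unsigned char c : data)
    {
      hash ^= c;
      hash *= 1099511628211ULL;			/* FNV prime.  */
    }
  return hash;
}

/* Hypothetical helper: return a hex string suitable for use as the
   value of a 'mountinfo' key, computed from the contents of PATH
   (normally /proc/self/mountinfo).  Return an empty string if PATH
   cannot be opened.  */

std::string
mountinfo_key_value (const std::string &path = "/proc/self/mountinfo")
{
  std::ifstream in (path, std::ios::binary);
  if (!in)
    return "";

  std::ostringstream contents;
  contents << in.rdbuf ();

  char buf[17];
  std::snprintf (buf, sizeof (buf), "%016llx",
		 (unsigned long long) fnv1a_64 (contents.str ()));
  return buf;
}
```

Both GDB and gdbserver would compute the value for their own process and the two would have to match, in the same way as the other keys.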
Hi Andrew,
On Wed, 2023-08-16 at 16:55 +0100, Andrew Burgess wrote:
> diff --git a/gdbserver/server.cc b/gdbserver/server.cc
> index e749194e039..ae40e885e70 100644
> --- a/gdbserver/server.cc
> +++ b/gdbserver/server.cc
> @@ -51,6 +51,8 @@
> #include "gdbsupport/scoped_restore.h"
> #include "gdbsupport/search.h"
>
> +#include <systemd/sd-id128.h>
> +
> /* PBUFSIZ must also be at least as big as IPA_CMD_BUF_SIZE, because
> the client state data is passed directly to some agent
> functions. */
This include is only available when systemd-devel headers are
installed. But it seems not to be used, so can just be removed as far
as I can tell.
Cheers,
Mark
Mark Wielaard <mark@klomp.org> writes:
> Hi Andrew,
>
> On Wed, 2023-08-16 at 16:55 +0100, Andrew Burgess wrote:
>> diff --git a/gdbserver/server.cc b/gdbserver/server.cc
>> index e749194e039..ae40e885e70 100644
>> --- a/gdbserver/server.cc
>> +++ b/gdbserver/server.cc
>> @@ -51,6 +51,8 @@
>> #include "gdbsupport/scoped_restore.h"
>> #include "gdbsupport/search.h"
>>
>> +#include <systemd/sd-id128.h>
>> +
>> /* PBUFSIZ must also be at least as big as IPA_CMD_BUF_SIZE, because
>> the client state data is passed directly to some agent
>> functions. */
>
> This include is only available when systemd-devel headers are
> installed. But it seems not to be used, so can just be removed as far
> as I can tell.
Thanks for spotting that. An earlier version of this patch made use of
sd_id128_get_machine, but in the end I figured it was easier just to
read the file directly.
I've fixed this locally for now, the fix will be in V2.
Thanks,
Andrew
Eli Zaretskii <eliz@gnu.org> writes:
>> Cc: Andrew Burgess <aburgess@redhat.com>
>> Date: Wed, 16 Aug 2023 16:55:03 +0100
>> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
>>
>> diff --git a/gdb/NEWS b/gdb/NEWS
>> index 9839330c46d..d83f097d937 100644
>> --- a/gdb/NEWS
>> +++ b/gdb/NEWS
>> @@ -272,6 +272,11 @@ qDefaultExecAndArgs
>> which the server was started. If no such information was given to
>> the server then this is reflected in the reply.
>>
>> +qMachineId
>> + This packet returns an identifier that allows GDB to determine if
>> + the remote server and GDB are running on the same host, and can see
>> + the same filesystem.
>> +
>> *** Changes in GDB 13
>
> This part is OK.
>
>> @@ -44748,6 +44749,49 @@
>> Indicates an error was encountered.
>> @end table
>>
>> +@anchor{Machine-Id Packet}
>> +@item qMachineId
>> +@cindex query remote machine-id, remote request
>> +@cindex @samp{qMachineId} packet
>
> Please move the @cindex entries before the @item line, so that
> index-search in an Info reader lands you on the line corresponding to
> @item, not on the text after that.
>
>> +within a single reply. See @ref{Machine-Id Details} for details of
> ^
> Comma there, please (it is needed for some older versions of Texinfo).
Thanks for the feedback, I'll get that fixed.
> Btw, when a sentence starts with "See @ref", you could simplify by
> using @xref, as that is its main purpose.
When generating the 'info' docs @xref doesn't add the 'See' prefix, but
when generating the pdf 'See' is added. I guess this explains why 'See
@ref' is common throughout the GDB manual.
It seems strange the prefix is added in some contexts, but not in
others. The info docs don't seem to read as clearly without the 'See'
prefix, so for me at least, @xref seems the worse choice.
My question is, what am I missing here? I'm sure there's some logic
behind this difference, but I'm not seeing it.
Thanks,
Andrew
> From: Andrew Burgess <aburgess@redhat.com>
> Cc: gdb-patches@sourceware.org
> Date: Fri, 25 Aug 2023 15:49:00 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Btw, when a sentence starts with "See @ref", you could simplify by
> > using @xref, as that is its main purpose.
>
> When generating the 'info' docs @xref doesn't add the 'See' prefix, but
> when generating the pdf 'See' is added. I guess this explains why 'See
> @ref' is common throughout the GDB manual.
No, @xref always produces "See". Where did you see it without "See"?
>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
Andrew> linux-boot-id - this returns the value from the file
Andrew> /proc/sys/kernel/random/boot_id, which, if I understand correctly,
Andrew> should be unique(ish) for each boot of each machine, and
At one point I thought I found docs saying this should be kept
confidential and in particular not sent over the network. I can't find
those any more but I did find them for /etc/machine-id.
When we discussed this on irc, Pedro had a different idea, based on
using the existing remote file operations: write a random number /
identifying string to a local file (say, something in /tmp or maybe
gdb's cache directory), then ask the remote to read it. If the read
succeeds and the result is identical, assume the machines are the same.
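A minimal sketch of that idea, with the remote side abstracted behind a callback. Everything here is hypothetical: a real implementation would go through GDB's existing remote file operations rather than a std::function, and would create the file more carefully (for example with mkstemp) than this example does.

```cpp
#include <cstdio>
#include <fstream>
#include <functional>
#include <optional>
#include <random>
#include <string>

/* Hypothetical stand-in for "ask the remote to read this file".
   Returns the file contents, or an empty optional if the remote
   could not read PATH.  */
using remote_reader
  = std::function<std::optional<std::string> (const std::string &path)>;

/* Write a random token to PATH on the local machine, ask the remote
   to read PATH back, then delete the file.  Return true only if the
   remote read succeeded and returned exactly the token.  */

bool
probe_same_filesystem (const std::string &path,
		       const remote_reader &remote_read)
{
  std::random_device rd;
  std::string token = std::to_string (rd ()) + "-" + std::to_string (rd ());

  {
    std::ofstream out (path, std::ios::trunc);
    if (!out)
      return false;
    out << token;
  }

  std::optional<std::string> seen = remote_read (path);
  std::remove (path.c_str ());

  return seen.has_value () && *seen == token;
}
```

Unlike the boot-id approach, nothing identifying the machine is sent over the connection; only a throwaway token is compared.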
Tom
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Andrew Burgess <aburgess@redhat.com>
>> Cc: gdb-patches@sourceware.org
>> Date: Fri, 25 Aug 2023 15:49:00 +0100
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> > Btw, when a sentence starts with "See @ref", you could simplify by
>> > using @xref, as that is its main purpose.
>>
>> When generating the 'info' docs @xref doesn't add the 'See' prefix, but
>> when generating the pdf 'See' is added. I guess this explains why 'See
>> @ref' is common throughout the GDB manual.
>
> No, @xref always produces "See". Where did you see it without "See"?
Sorry Eli, I missed your reply here.
I am using GNU texinfo 6.7, and I don't see @xref adding "See" when
generating 'info' output format. When producing 'pdf' output I do see
the "See" prefix.
This is visible to me in the current GDB info page(s).
For example, this paragraph in gdb.texinfo:
If the architecture supports memory tagging, the @code{print} command will
display pointer/memory tag mismatches if what is being printed is a pointer
or reference type. @xref{Memory Tagging}.
Is rendered like this in 'info' format:
If the architecture supports memory tagging, the 'print' command will
display pointer/memory tag mismatches if what is being printed is a
pointer or reference type. *Note Memory Tagging::.
I guess it has the '*Note ' prefix instead, but in contrast, this
partial paragraph from gdb.texinfo:
This command allows to control the information printed when
the debugger prints a frame. See @ref{Frames}, @ref{Backtrace},
for a general explanation about frames and frame information.
Is rendered like this in 'info' format:
This command allows to control the information printed when the
debugger prints a frame. See *note Frames::, *note Backtrace::,
for a general explanation about frames and frame information.
Which I think is nicer, for me starting the sentence with 'Note' doesn't
seem as friendly.
Thanks,
Andrew
> From: Andrew Burgess <aburgess@redhat.com>
> Cc: gdb-patches@sourceware.org
> Date: Tue, 26 Sep 2023 15:42:26 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > No, @xref always produces "See". Where did you see it without "See"?
>
> Sorry Eli, I missed your reply here.
>
> I am using GNU texinfo 6.7, and I don't see @xref adding "See" when
> generating 'info' output format. When producing 'pdf' output I do see
> the "See" prefix.
>
> This is visible to me in the current GDB info page(s).
>
> For example, this paragraph in gdb.texinfo:
>
> If the architecture supports memory tagging, the @code{print} command will
> display pointer/memory tag mismatches if what is being printed is a pointer
> or reference type. @xref{Memory Tagging}.
>
> Is rendered like this in 'info' format:
>
> If the architecture supports memory tagging, the 'print' command will
> display pointer/memory tag mismatches if what is being printed is a
> pointer or reference type. *Note Memory Tagging::.
>
> I guess it has the '*Note ' prefix instead, but in contrast
Yes, in Info format, @xref produces a capitalized "Note". (Emacs
replaces that on display with "See", so I tend to forget about the
actual conversion, sorry about that). And a capitalized "Note" is also
inappropriate in the middle of a sentence.
> This command allows to control the information printed when
> the debugger prints a frame. See @ref{Frames}, @ref{Backtrace},
> for a general explanation about frames and frame information.
>
> Is rendered like this in 'info' format:
>
> This command allows to control the information printed when the
> debugger prints a frame. See *note Frames::, *note Backtrace::,
> for a general explanation about frames and frame information.
>
> Which I think is nicer, for me starting the sentence with 'Note' doesn't
> seem as friendly.
It's your personal preference, so I don't want to argue. It is
suboptimal from my POV, which is why you will almost never see that in
the Emacs manuals.
@@ -1171,6 +1171,7 @@ COMMON_SFILES = \
reggroups.c \
remote.c \
remote-fileio.c \
+ remote-machine-id.c \
remote-notif.c \
reverse.c \
run-on-main-thread.c \
@@ -1448,6 +1449,7 @@ HFILES_NO_SRCDIR = \
regset.h \
remote.h \
remote-fileio.h \
+ remote-machine-id.h \
remote-notif.h \
riscv-fbsd-tdep.h \
riscv-ravenscar-thread.h \
@@ -1568,6 +1570,7 @@ HFILES_NO_SRCDIR = \
nat/gdb_thread_db.h \
nat/fork-inferior.h \
nat/linux-btrace.h \
+ nat/linux-machine-id.h \
nat/linux-namespaces.h \
nat/linux-nat.h \
nat/linux-osdata.h \
@@ -272,6 +272,11 @@ qDefaultExecAndArgs
which the server was started. If no such information was given to
the server then this is reflected in the reply.
+qMachineId
+ This packet returns an identifier that allows GDB to determine if
+ the remote server and GDB are running on the same host, and can see
+ the same filesystem.
+
*** Changes in GDB 13
* MI version 1 is deprecated, and will be removed in GDB 14.
@@ -58,7 +58,7 @@ case ${gdb_host} in
proc-service.o \
linux-thread-db.o linux-nat.o nat/linux-osdata.o linux-fork.o \
nat/linux-procfs.o nat/linux-ptrace.o nat/linux-waitpid.o \
- nat/linux-personality.o nat/linux-namespaces.o'
+ nat/linux-personality.o nat/linux-machine-id.o nat/linux-namespaces.o'
NAT_CDEPS='$(srcdir)/proc-service.list'
LOADLIBES='-ldl $(RDYNAMIC)'
;;
@@ -41666,6 +41666,7 @@
* Traceframe Info Format::
* Branch Trace Format::
* Branch Trace Configuration Format::
+* Machine-Id Details::
@end menu
@node Overview
@@ -44748,6 +44749,49 @@
Indicates an error was encountered.
@end table
+@anchor{Machine-Id Packet}
+@item qMachineId
+@cindex query remote machine-id, remote request
+@cindex @samp{qMachineId} packet
+Access a remote @dfn{machine-id}. The machine-id returned in response
+to this packet is compared to a machine-id created on the host the
+debugger is running on. If the two machine-ids match then the debugger
+will assume that the remote server and the debugger are running on the
+same machine, and can access the same files. The debugger will use
+this knowledge to avoid unnecessary copying of files from the remote
+(@pxref{File-I/O Remote Protocol Extension}).
+
+Reply:
+@table @samp
+@item predicate;@var{key}=@var{value}@r{[};@var{key}=@var{value}@r{]*}
+Returning a string starting with @samp{predicate;}, followed by one or
+more @var{key}=@var{value} pairs, defines a machine-id. Each
+@var{key} and @var{value} is a non-empty string that must not contain
+the characters @samp{;} or @samp{=}. Each @var{key} must be unique
+within a single reply. See @ref{Machine-Id Details} for details of
+valid @var{key}s and their @var{value}s.
+
+@item remote
+Returning the string @samp{remote} indicates that the remote server
+should always be considered truly remote, and files the debugger needs
+to access should be first copied from the remote.
+
+@item local
+Returning the string @samp{local} indicates that the remote server
+should always be treated as running on the same host as the debugger.
+The debugger will avoid copying files from the remote server, and will
+instead try to access files directly.
+
+Sending this reply will rarely be appropriate, as it implies certainty
+about where the remote server and debugger are running. However, in
+some tightly controlled environments this might be appropriate. Using
+a @samp{predicate} based reply would be better if at all possible.
+
+@item E @var{NN}
+@itemx E.errtext
+Indicates an error was encountered.
+@end table
+
@item Qbtrace:bts
Enable branch tracing for the current thread using Branch Trace Store.
@@ -47455,6 +47499,62 @@
<!ATTLIST pt size CDATA #IMPLIED>
@end smallexample
+@node Machine-Id Details
+@section Machine-Id Details
+@cindex machine-id key-value pair details
+
+This section describes the valid @var{key}s and @var{value}s that can
+be returned in response to the @samp{qMachineId} packet
+(@pxref{Machine-Id Packet}), specifically, when using a
+@samp{predicate;} based reply. Other reply types for the
+@samp{qMachineId} packet don't include @var{key}s and @var{value}s.
+
+There are two types of @var{key}, master keys and secondary keys. A
+reply should contain exactly one master key, and zero or more
+secondary keys. The set of valid secondary keys will depend on which
+master key is used.
+
+No @var{key} or @var{value} can contain the characters @samp{;} or
+@samp{=}.
+
+The order of the @var{key}/@var{value} pairs in the reply does not
+matter.
+
+Currently supported master and secondary keys are described below:
+
+@table @samp
+@item linux-boot-id
+The value for this master key contains the contents of the first line
+of the file @file{/proc/sys/kernel/random/boot_id} with any @samp{-}
+characters filtered out.
+
+@table @samp
+@item cuserid
+The value for this secondary key contains a username string associated
+with the effective user ID of the remote server process.
+@end table
+
+An example reply using @samp{linux-boot-id} and all secondary keys is:
+
+@smallexample
+predicate;linux-boot-id=28d154b3b1518383b3b4efcbd221fa7d;cuserid=username
+@end smallexample
+
+@end table
+
+When matching a machine-id @value{GDBN} first checks the reply for a
+master key that it understands. If a suitable key is found
+@value{GDBN} checks that the value for the master key matches its
+value for the master key. If the master key value matches, then
+@value{GDBN} checks all the remaining @var{key}/@var{value} pairs;
+each @var{key} must be a known secondary key associated with the
+previously matched master key, and the secondary @var{value} must
+match @value{GDBN}'s computed value.
+
+If all @var{key}s are known, and their @var{value}s match, then
+@value{GDBN} considers the machine-id a match; otherwise, the
+machine-id is considered non-matching.
+
@include agentexpr.texi
@node Target Descriptions
@@ -69,6 +69,8 @@
#include "gdbsupport/scope-exit.h"
#include "gdbsupport/gdb-sigmask.h"
#include "gdbsupport/common-debug.h"
+#include "remote-machine-id.h"
+#include "nat/linux-machine-id.h"
#include <unordered_map>
/* This comment documents high-level logic of this file.
@@ -4497,6 +4499,35 @@ current_lwp_ptid (void)
return inferior_ptid;
}
+struct linux_nat_machine_id_validation : public machine_id_validation
+{
+ linux_nat_machine_id_validation ()
+ : machine_id_validation ("linux-boot-id")
+ { /* Nothing. */ }
+
+ bool check_master_key (const std::string &value) override
+ {
+ std::string boot_id = gdb_linux_machine_id_linux_boot_id ();
+ if (boot_id.empty ())
+ return false;
+ return boot_id == value;
+ }
+
+ bool check_secondary_key (const std::string &key,
+ const std::string &value) override
+ {
+ if (key == "cuserid")
+ {
+ std::string username = gdb_linux_machine_cuserid ();
+ if (username.empty ())
+ return false;
+ return username == value;
+ }
+
+ return false;
+ }
+};
+
void _initialize_linux_nat ();
void
_initialize_linux_nat ()
@@ -4534,6 +4565,10 @@ Enables printf debugging output."),
sigemptyset (&blocked_mask);
lwp_lwpid_htab_create ();
+
+ std::unique_ptr<linux_nat_machine_id_validation> validation
+ (new linux_nat_machine_id_validation);
+ register_machine_id_validation (std::move (validation));
}
new file mode 100644
@@ -0,0 +1,60 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include "gdbsupport/common-defs.h"
+
+#include "nat/linux-machine-id.h"
+#include "safe-ctype.h"
+
+#include <unistd.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+
+/* See nat/linux-machine-id.h. */
+
+std::string
+gdb_linux_machine_id_linux_boot_id ()
+{
+ int fd = open ("/proc/sys/kernel/random/boot_id", O_RDONLY);
+ if (fd < 0)
+ return "";
+
+ std::string boot_id;
+ char buf;
+ while (read (fd, &buf, sizeof (buf)) == sizeof (buf))
+ {
+ if (ISXDIGIT (buf))
+ boot_id += buf;
+ }
+
+ close (fd);
+
+ return boot_id;
+}
+
+/* See nat/linux-machine-id.h. */
+
+std::string
+gdb_linux_machine_cuserid ()
+{
+ char cuserid_str[L_cuserid];
+ char *res = cuserid (cuserid_str);
+ if (res == nullptr)
+ return "";
+
+ return std::string (cuserid_str);
+}
new file mode 100644
@@ -0,0 +1,44 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef NAT_LINUX_MACHINE_ID_H
+#define NAT_LINUX_MACHINE_ID_H
+
+#include <string>
+
+/* Return a string that contains the Linux boot-id, formatted for use in
+ the qMachineId packet. If anything goes wrong then an empty string is
+ returned, otherwise a non-empty string is returned.
+
+ This is used by gdbserver when sending the reply to a qMachineId packet,
+ and used by GDB to check the value returned for a qMachineId
+ packet. */
+
+extern std::string gdb_linux_machine_id_linux_boot_id ();
+
+/* Return a string that contains the result of calling cuserid, that is, a
+ username associated with the effective user-id of the current process.
+ If anything goes wrong then an empty string is returned, otherwise a
+ non-empty string is returned.
+
+ This is used by gdbserver when sending the reply to a qMachineId packet,
+ and used by GDB to check the value returned for a qMachineId
+ packet. */
+
+extern std::string gdb_linux_machine_cuserid ();
+
+#endif /* NAT_LINUX_MACHINE_ID_H */
new file mode 100644
@@ -0,0 +1,69 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#include "defs.h"
+#include "remote-machine-id.h"
+
+#include <string>
+#include <vector>
+
+/* List of all registered machine_id_validation objects. */
+static std::vector<std::unique_ptr<machine_id_validation>> validation_list;
+
+/* See remote-machine-id.h. */
+
+void
+register_machine_id_validation (std::unique_ptr<machine_id_validation> &&validation)
+{
+ validation_list.emplace_back (std::move (validation));
+}
+
+/* See remote-machine-id.h. */
+
+bool
+validate_machine_id (const std::unordered_map<std::string, std::string> &kv_pairs)
+{
+ for (const auto &validator : validation_list)
+ {
+ const auto kv_master = kv_pairs.find (validator->master_key ());
+ if (kv_master == kv_pairs.end ())
+ continue;
+
+ if (!validator->check_master_key (kv_master->second))
+ continue;
+
+ /* Check all the secondary keys in KV_PAIRS. */
+ bool match_failed = false;
+ for (const auto &kv : kv_pairs)
+ {
+ if (kv.first == validator->master_key ())
+ continue;
+
+ if (!validator->check_secondary_key (kv.first, kv.second))
+ {
+ match_failed = true;
+ break;
+ }
+ }
+
+ if (!match_failed)
+ return true;
+ }
+
+ /* None of the machine_id_validation objects matched KV_PAIRS. */
+ return false;
+}
new file mode 100644
@@ -0,0 +1,108 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef REMOTE_MACHINE_ID_H
+#define REMOTE_MACHINE_ID_H
+
+#include <memory>
+#include <unordered_map>
+
+/* A base class from which machine-id validation objects can be created.
+ A remote target can send GDB a machine-id, which can be used to check
+ if the remote target and GDB are running on the same machine, and have
+ a common view of the file-system. Knowing this allows GDB to optimise
+ some of its interactions with the remote target.
+
+ A machine-id consists of a set of key-value pairs, where both keys and
+ values are std::string objects. A machine-id has a single master key
+ and some number of secondary keys.
+
+ Within GDB the native target will register one or more of these objects
+ by calling register_machine_id_validation. When GDB receives a
+ machine-id from a remote-target each registered machine_id_validation
+ object will be checked in turn to see if it matches the machine-id.
+ If any machine_id_validation matches then this indicates that GDB and
+ the remote target are on the same machine. */
+struct machine_id_validation
+{
+ /* Constructor. MASTER_KEY is the name of the master key that this
+ machine_id_validation object validates for. */
+ machine_id_validation (std::string &&master_key)
+ : m_master_key (std::move (master_key))
+ { /* Nothing. */ }
+
+ /* Destructor. */
+ virtual ~machine_id_validation ()
+ { /* Nothing. */ }
+
+ /* Return a reference to the master key. */
+ const std::string &
+ master_key () const
+ {
+ return m_master_key;
+ }
+
+ /* VALUE is a string passed from the remote target corresponding to the
+ key for master_key(). If the remote target didn't pass a key
+ matching master_key() then this function should not be called.
+
+ Return true if VALUE matches the value calculated for the host on
+ which GDB is currently running. */
+ virtual bool
+ check_master_key (const std::string &value) = 0;
+
+ /* This function will only be called for a machine-id which contains a
+ key matching master_key(), and for which check_master_key() returned
+ true.
+
+ KEY and VALUE are a key-value pair passed from the remote target.
+ This function should return true if KEY is known, and VALUE matches
+ the value calculated for the host on which GDB is running. If KEY is
+ not known, or VALUE doesn't match, then this function should return
+ false. */
+ virtual bool
+ check_secondary_key (const std::string &key, const std::string &value) = 0;
+
+private:
+ /* The master key for which this object validates machine-ids. */
+ std::string m_master_key;
+};
+
+/* Register a new machine_id_validation object. */
+
+extern void register_machine_id_validation
+ (std::unique_ptr<machine_id_validation> &&validation);
+
+/* KV_PAIRS contains all the machine-id key/value pairs obtained from the
+ remote target; the keys of the map are the machine-id keys, and the
+ mapped values are the corresponding machine-id values. These pairs are
+ checked against all of the registered machine_id_validation objects.
+
+ If any machine_id_validation matches all the data in KV_PAIRS then this
+ function returns true, otherwise, this function returns false.
+
+ For KV_PAIRS to match against a machine_id_validation object, KV_PAIRS
+ must contain a key matching machine_id_validation::master_key(), and the
+ value for that key must return true when passed to the function
+ machine_id_validation::check_master_key(). Then, for every other
+ key/value pair machine_id_validation::check_secondary_key() must return
+ true. */
+
+extern bool validate_machine_id
+ (const std::unordered_map<std::string, std::string> &kv_pairs);
+
+#endif /* REMOTE_MACHINE_ID_H */
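The matching rule documented above for validate_machine_id can be sketched as follows. This is a minimal standalone illustration, not the implementation from this series: validation_sketch is a cut-down stand-in for machine_id_validation, and fixed_boot_id_validation, matches_one and validate_sketch are hypothetical names invented for the example. The registry is passed in explicitly here, whereas the patch keeps it behind register_machine_id_validation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using kv_map = std::unordered_map<std::string, std::string>;

/* Cut-down stand-in for machine_id_validation (assumption: same three
   operations, nothing else).  */
struct validation_sketch
{
  explicit validation_sketch (std::string master_key)
    : m_master_key (std::move (master_key))
  {}
  virtual ~validation_sketch () = default;

  const std::string &master_key () const { return m_master_key; }
  virtual bool check_master_key (const std::string &value) = 0;
  virtual bool check_secondary_key (const std::string &key,
                                    const std::string &value) = 0;

private:
  std::string m_master_key;
};

/* Hypothetical validator that compares against fixed expected values
   instead of values computed for the host.  */
struct fixed_boot_id_validation : validation_sketch
{
  fixed_boot_id_validation (std::string boot_id, std::string user)
    : validation_sketch ("linux-boot-id"),
      m_boot_id (std::move (boot_id)),
      m_user (std::move (user))
  {}

  bool check_master_key (const std::string &value) override
  { return value == m_boot_id; }

  bool check_secondary_key (const std::string &key,
                            const std::string &value) override
  { return key == "cuserid" && value == m_user; }

private:
  std::string m_boot_id;
  std::string m_user;
};

/* Return true if KV_PAIRS has V's master key with a matching value, and
   every other pair passes check_secondary_key.  */
static bool
matches_one (validation_sketch &v, const kv_map &kv_pairs)
{
  auto master = kv_pairs.find (v.master_key ());
  if (master == kv_pairs.end () || !v.check_master_key (master->second))
    return false;

  for (const auto &kv : kv_pairs)
    if (kv.first != v.master_key ()
        && !v.check_secondary_key (kv.first, kv.second))
      return false;

  return true;
}

/* Return true if any object in REGISTRY matches all of KV_PAIRS.  */
static bool
validate_sketch (const std::vector<std::unique_ptr<validation_sketch>> &registry,
                 const kv_map &kv_pairs)
{
  for (const auto &v : registry)
    if (matches_one (*v, kv_pairs))
      return true;
  return false;
}
```

Note that an unknown secondary key makes the whole match fail, which is the conservative choice: a remote that reports something GDB cannot verify is treated as truly remote.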
@@ -80,6 +80,7 @@
#include "async-event.h"
#include "gdbsupport/selftest.h"
#include "cli/cli-style.h"
+#include "remote-machine-id.h"
/* The remote target. */
@@ -306,6 +307,9 @@ enum {
/* Support the qDefaultExecAndArgs packet. */
PACKET_qDefaultExecAndArgs,
+ /* Support the qMachineId packet. */
+ PACKET_qMachineId,
+
PACKET_MAX
};
@@ -557,6 +561,15 @@ class remote_state
this can go away. */
int wait_forever_enabled_p = 1;
+ /* Set to true if the remote target returned a machine-id (see
+ qMachineId packet) which matched one of the registered validation
+ objects. This indicates that the remote target is running on the
+ same host as GDB (and can see the same filesystem as GDB).
+
+ Otherwise, this is false, which indicates the remote target should be
+ treated as truly remote. */
+ bool remote_target_is_local_p = false;
+
private:
/* Mapping of remote protocol data for each gdbarch. Usually there
is only one entry here, though we may see more with stubs that
@@ -1341,6 +1354,12 @@ class remote_target : public process_stratum_target
/* Fetch the executable filename and argument string from the remote. */
remote_exec_and_args_info fetch_default_executable_and_arguments ();
+ /* Send the qMachineId packet and process the reply. Update the
+ remote_state::remote_target_is_local_p field based on the result. We
+ assume that when this is called remote_target_is_local_p will be
+ false by default. */
+ void fetch_remote_machine_id ();
+
bool start_remote_1 (int from_tty, int extended_p);
/* The remote state. Don't reference this directly. Use the
@@ -5055,6 +5074,151 @@ struct scoped_mark_target_starting
scoped_restore_tmpl<bool> m_restore_starting_up;
};
+/* Extract a machine-id key/value pair from the null-terminated string
+ *STRP, and update *STRP to point to the first character after the parsed
+ key/value pair, including skipping any ';' that appears after the
+ key/value pair.
+
+ A key/value pair consists of two strings separated by an '=' character,
+ neither string will contain a '=' or ';' character.
+
+ Characters are read from *STRP until '=', ';' or the null character is
+ found; this forms the key string. If ';' or the null character was found
+ then the value string is empty. Otherwise, '=' was found, the '=' is
+ skipped, and characters are read until ';' or the null character is
+ found; this forms the value string.
+
+ This function will throw an error if the key string is found to be zero
+ length (e.g. '=abc' is invalid), or if the value string contains a '='
+ character (e.g. 'foo=def=ghi' is invalid).
+
+ The pair <key, value> is then returned. */
+
+static
+std::pair<std::string, std::string> extract_kv_pair (const char **strp)
+{
+ gdb_assert (strp != nullptr);
+ gdb_assert (*strp != nullptr);
+ gdb_assert (**strp != '\0');
+
+ std::string key, value;
+ const char *str = *strp;
+ while (*str != '=' && *str != ';' && *str != '\0')
+ {
+ key += *str;
+ ++str;
+ }
+
+ if (key.empty ())
+ error (_("empty key while parsing '%s'"), *strp);
+
+ if (*str == '\0' || *str == ';')
+ {
+ if (*str == ';')
+ ++str;
+ *strp = str;
+ return { key, "" };
+ }
+
+ gdb_assert (*str == '=');
+ ++str;
+
+ while (*str != ';' && *str != '\0')
+ {
+ if (*str == '=')
+ error (_("found '=' character in value string while parsing '%s'"),
+ *strp);
+ value += *str;
+ ++str;
+ }
+
+ if (*str == ';')
+ ++str;
+ *strp = str;
+ return { key, value };
+}
+
+/* See declaration in class above. */
+
+void
+remote_target::fetch_remote_machine_id ()
+{
+ struct remote_state *rs = get_remote_state ();
+
+ /* This should only be called for newly created remote_target objects, so
+ the remote_state::remote_target_is_local_p within the remote_target
+ should be false by default. */
+ gdb_assert (!rs->remote_target_is_local_p);
+
+ if (m_features.packet_support (PACKET_qMachineId) == PACKET_DISABLE)
+ return;
+
+ putpkt ("qMachineId");
+ getpkt (&rs->buf, 0);
+
+ auto packet_result = m_features.packet_ok (rs->buf, PACKET_qMachineId);
+ if (packet_result == PACKET_UNKNOWN)
+ return;
+
+ if (packet_result == PACKET_ERROR)
+ {
+ warning (_("Remote error: %s"), rs->buf.data ());
+ return;
+ }
+
+ /* If the machine-id is the string 'remote' then we are done. The
+ remote_target_is_local_p field is false by default. */
+ const char *id = rs->buf.data ();
+ if (startswith (id, "remote") && (id[6] == ';' || id[6] == '\0'))
+ return;
+
+ /* If the machine-id is the string 'local' then the remote claims to
+ "know" that it is on the same machine as GDB. Good luck with that. */
+ if (startswith (id, "local") && (id[5] == ';' || id[5] == '\0'))
+ {
+ rs->remote_target_is_local_p = true;
+ return;
+ }
+
+ /* If the machine-id starts with the string 'predicate;', then
+ everything after that string is the part of the machine-id that we
+ need to match against to confirm we are on the same machine as the
+ remote target. */
+ static const char *predicate_prefix = "predicate;";
+ if (!startswith (id, predicate_prefix))
+ return;
+ id += strlen (predicate_prefix);
+
+ /* Split the ID string into key/value pairs. */
+ std::unordered_map<std::string, std::string> kv;
+ try
+ {
+ while (*id != '\0')
+ {
+ auto kv_pair = extract_kv_pair (&id);
+ kv.emplace (std::move (kv_pair.first), std::move (kv_pair.second));
+ }
+ }
+ catch (const gdb_exception &ex)
+ {
+ /* Let the user know something went wrong, and then return, treating
+ the target as truly remote. */
+ warning (_("Error parsing qMachineId packet: %s"), ex.what ());
+ return;
+ }
+
+ /* If there were no predicates then this looks like a badly behaved
+ remote target; warn the user, and assume the target is remote. */
+ if (kv.empty ())
+ {
+ warning (_("no machine-id predicates in qMachineId packet reply"));
+ return;
+ }
+
+ /* Check to see if the remote machine is actually local. */
+ rs->remote_target_is_local_p = validate_machine_id (kv);
+}
+
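As a worked example of the key/value splitting described above, here is a minimal standalone sketch that applies the same rules to a std::string. It is an illustration only: split_pairs_sketch is a hypothetical name, it throws std::invalid_argument where the patch calls error (), and it expects the text that follows the 'predicate;' prefix.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

using kv_map = std::unordered_map<std::string, std::string>;

/* Split BODY, of the form "key=value;key=value", into a map.  A key with
   no '=' gets an empty value.  An empty key, or a '=' inside a value, is
   rejected, mirroring the checks in extract_kv_pair.  */
static kv_map
split_pairs_sketch (const std::string &body)
{
  kv_map result;
  std::string::size_type pos = 0;

  while (pos < body.size ())
    {
      /* Each item runs up to the next ';' or the end of the string.  */
      std::string::size_type end = body.find (';', pos);
      if (end == std::string::npos)
        end = body.size ();
      std::string item = body.substr (pos, end - pos);

      /* The key is everything before the first '='.  */
      std::string::size_type eq = item.find ('=');
      std::string key = item.substr (0, eq);
      std::string value
        = (eq == std::string::npos) ? "" : item.substr (eq + 1);

      if (key.empty ())
        throw std::invalid_argument ("empty key");
      if (value.find ('=') != std::string::npos)
        throw std::invalid_argument ("'=' character in value string");

      result.emplace (std::move (key), std::move (value));
      pos = end + 1;
    }

  return result;
}
```

With the input "linux-boot-id=abc;cuserid=andrew" this yields two entries, while "=abc" and "foo=def=ghi" are both rejected.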
/* See declaration in class above. */
remote_exec_and_args_info
@@ -5188,6 +5352,8 @@ remote_target::start_remote_1 (int from_tty, int extended_p)
rs->noack_mode = 1;
}
+ fetch_remote_machine_id ();
+
auto exec_and_args = fetch_default_executable_and_arguments ();
/* Update the inferior with the executable and argument string from the
@@ -15636,6 +15802,9 @@ Show the maximum size of the address (in bits) in a memory packet."), NULL,
add_packet_config_cmd (PACKET_qDefaultExecAndArgs, "qDefaultExecAndArgs",
"fetch-exec-and-args", 0);
+ add_packet_config_cmd (PACKET_qMachineId, "qMachineId",
+ "fetch-machine-id", 0);
+
/* Assert that we've registered "set remote foo-packet" commands
for all packet configs. */
{
@@ -220,6 +220,7 @@ SFILES = \
$(srcdir)/../gdb/nat/aarch64-mte-linux-ptrace.c \
$(srcdir)/../gdb/nat/aarch64-sve-linux-ptrace.c \
$(srcdir)/../gdb/nat/linux-btrace.c \
+ $(srcdir)/../gdb/nat/linux-machine-id.c \
$(srcdir)/../gdb/nat/linux-namespaces.c \
$(srcdir)/../gdb/nat/linux-osdata.c \
$(srcdir)/../gdb/nat/linux-personality.c \
@@ -26,7 +26,7 @@ ipa_ppc_linux_regobj="powerpc-32l-ipa.o powerpc-altivec32l-ipa.o powerpc-vsx32l-
# Linux object files. This is so we don't have to repeat
# these files over and over again.
-srv_linux_obj="linux-low.o nat/linux-osdata.o nat/linux-procfs.o nat/linux-ptrace.o nat/linux-waitpid.o nat/linux-personality.o nat/linux-namespaces.o fork-child.o nat/fork-inferior.o"
+srv_linux_obj="linux-low.o nat/linux-osdata.o nat/linux-procfs.o nat/linux-ptrace.o nat/linux-waitpid.o nat/linux-personality.o nat/linux-machine-id.o nat/linux-namespaces.o fork-child.o nat/fork-inferior.o"
# Input is taken from the "${host}" and "${target}" variables.
@@ -61,6 +61,7 @@
#include <elf.h>
#endif
#include "nat/linux-namespaces.h"
+#include "nat/linux-machine-id.h"
#ifndef O_LARGEFILE
#define O_LARGEFILE 0
@@ -6940,6 +6941,24 @@ linux_process_target::thread_pending_child (thread_info *thread)
return get_lwp_thread (child);
}
+/* See target.h. */
+
+std::string
+linux_process_target::get_machine_id () const
+{
+ std::string boot_id = gdb_linux_machine_id_linux_boot_id ();
+ if (boot_id.empty ())
+ return "";
+ boot_id = "linux-boot-id=" + boot_id;
+
+ std::string username = gdb_linux_machine_cuserid ();
+ if (username.empty ())
+ return "";
+ username = "cuserid=" + username;
+
+ return boot_id + ";" + username;
+}
+
/* Default implementation of linux_target_ops method "set_pc" for
32-bit pc register which is literally named "pc". */
@@ -317,6 +317,8 @@ class linux_process_target : public process_stratum_target
bool supports_catch_syscall () override;
+ std::string get_machine_id () const override;
+
/* Return the information to access registers. This has public
visibility because proc-service uses it. */
virtual const regs_info *get_regs_info () = 0;
@@ -51,6 +51,8 @@
#include "gdbsupport/scoped_restore.h"
#include "gdbsupport/search.h"
+#include <systemd/sd-id128.h>
+
/* PBUFSIZ must also be at least as big as IPA_CMD_BUF_SIZE, because
the client state data is passed directly to some agent
functions. */
@@ -2730,6 +2732,18 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
return;
}
+ if (strcmp ("qMachineId", own_buf) == 0)
+ {
+ std::string machine_id = the_target->get_machine_id ();
+ if (!machine_id.empty ())
+ machine_id = std::string ("predicate;") + machine_id;
+ else
+ machine_id = std::string ("remote");
+
+ strcpy (own_buf, machine_id.c_str ());
+ return;
+ }
+
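Taken together with linux_process_target::get_machine_id, the reply that gdbserver sends can be sketched as below. This is a hedged illustration rather than the code from the patch: build_machine_id_reply_sketch is a hypothetical helper, and the boot-id and user name are passed in as plain strings instead of being read from the system.

```cpp
#include <cassert>
#include <string>

/* Build the qMachineId reply from BOOT_ID and USERNAME.  If either piece
   is unavailable the server falls back to "remote", so GDB treats the
   target as truly remote.  */
static std::string
build_machine_id_reply_sketch (const std::string &boot_id,
                               const std::string &username)
{
  if (boot_id.empty () || username.empty ())
    return "remote";

  return "predicate;linux-boot-id=" + boot_id + ";cuserid=" + username;
}
```

For example, a boot-id of "abc" and a user name of "andrew" give the reply "predicate;linux-boot-id=abc;cuserid=andrew".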
/* Otherwise we didn't know what packet it was. Say we didn't
understand it. */
own_buf[0] = 0;
@@ -442,6 +442,14 @@ process_stratum_target::store_memtags (CORE_ADDR address, size_t len,
gdb_assert_not_reached ("target op store_memtags not supported");
}
+/* See target.h. */
+
+std::string
+process_stratum_target::get_machine_id () const
+{
+ return "";
+}
+
int
process_stratum_target::read_offsets (CORE_ADDR *text, CORE_ADDR *data)
{
@@ -508,6 +508,15 @@ class process_stratum_target
Returns true if successful and false otherwise. */
virtual bool store_memtags (CORE_ADDR address, size_t len,
const gdb::byte_vector &tags, int type);
+
+ /* Return a string representing a machine-id suitable for returning
+ within a qMachineId packet response, but don't include the
+ 'predicate;' prefix.
+
+ If the current target doesn't support machine-id, or if we fail to
+ build the machine-id for any reason, then return an empty string; the
+ server will then send back a suitable reply to the debugger. */
+ virtual std::string get_machine_id () const;
};
extern process_stratum_target *the_target;