[1/2] ld: Support percent-encoded JSON in --package-metadata

Message ID 20241111100207.219703-1-benjamin.drung@canonical.com
State New
Headers
Series [1/2] ld: Support percent-encoded JSON in --package-metadata |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm success Test passed

Commit Message

Benjamin Drung Nov. 11, 2024, 10:02 a.m. UTC
  Specifying the compiler flag `-Wl,--package-metadata=<JSON>` will not
work in case the JSON contains a comma, because compiler drivers eat
commas. Example:

```
$ echo "void main() { }" > test.c
$ gcc '-Wl,--package-metadata={"type":"deb","os":"ubuntu"}' test.c
/usr/bin/ld: cannot find "os":"ubuntu"}: No such file or directory
collect2: error: ld returned 1 exit status
```

The quotation marks in the JSON value do not work well with shell nor
make. Specifying the `--package-metadata` linker flag in a `LDFLAGS`
environment variable might loose its quotation marks when it hits the
final compiler call.

So support percent-encoded and %[string] encoded JSON data in the
`--package-metadata` linker flag. Percent-encoding is used because it is
a standard, simple to implement, and does take too many additional
characters. %[string] encoding is supported for having a more readable
encoding.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=32003
Bug-Ubutru: https://bugs.launchpad.net/bugs/2071468
Signed-off-by: Benjamin Drung <benjamin.drung@canonical.com>
---
 ld/NEWS                              |  4 ++
 ld/emultempl/elf.em                  | 12 ++++-
 ld/ld.texi                           |  5 ++
 ld/ldmisc.c                          | 78 ++++++++++++++++++++++++++++
 ld/ldmisc.h                          |  1 +
 ld/testsuite/ld-elf/package-note.exp | 36 +++++++++++++
 ld/testsuite/ld-elf/package-note2.rd |  6 +++
 7 files changed, 140 insertions(+), 2 deletions(-)
 create mode 100644 ld/testsuite/ld-elf/package-note2.rd
  

Comments

Jan Beulich Nov. 12, 2024, 1:35 p.m. UTC | #1
On 11.11.2024 11:02, Benjamin Drung wrote:
> --- a/ld/NEWS
> +++ b/ld/NEWS
> @@ -8,6 +8,10 @@ Changes in 2.44:
>  * Add a "--build-id=xx" option, if built with the xxhash library.  This
>    produces a 128-bit hash, 2-4x faster than md5 or sha1.
>  
> +* The ELF linker option --package-metadata option supports percent-encoded

Nit: One too many "option" I think. Apart from this okay to put in as long
as within two weeks no other comments / objections show up.

Jan
  
Benjamin Drung Nov. 12, 2024, 3:28 p.m. UTC | #2
On Tue, 2024-11-12 at 14:35 +0100, Jan Beulich wrote:
> On 11.11.2024 11:02, Benjamin Drung wrote:
> > --- a/ld/NEWS
> > +++ b/ld/NEWS
> > @@ -8,6 +8,10 @@ Changes in 2.44:
> >  * Add a "--build-id=xx" option, if built with the xxhash library.  This
> >    produces a 128-bit hash, 2-4x faster than md5 or sha1.
> >  
> > +* The ELF linker option --package-metadata option supports percent-encoded
> 
> Nit: One too many "option" I think. Apart from this okay to put in as long
> as within two weeks no other comments / objections show up.

The second "option" should be removed. Should I resubmit with that typo
removed or will you adjust the patch when merging?
  
Jan Beulich Nov. 12, 2024, 3:43 p.m. UTC | #3
On 12.11.2024 16:28, Benjamin Drung wrote:
> On Tue, 2024-11-12 at 14:35 +0100, Jan Beulich wrote:
>> On 11.11.2024 11:02, Benjamin Drung wrote:
>>> --- a/ld/NEWS
>>> +++ b/ld/NEWS
>>> @@ -8,6 +8,10 @@ Changes in 2.44:
>>>  * Add a "--build-id=xx" option, if built with the xxhash library.  This
>>>    produces a 128-bit hash, 2-4x faster than md5 or sha1.
>>>  
>>> +* The ELF linker option --package-metadata option supports percent-encoded
>>
>> Nit: One too many "option" I think. Apart from this okay to put in as long
>> as within two weeks no other comments / objections show up.
> 
> The second "option" should be removed. Should I resubmit with that typo
> removed or will you adjust the patch when merging?

No need to re-submit just for that; I wasn't aware though that I ought to
(eventually) merge it.

Jan
  
Benjamin Drung Nov. 18, 2024, 10:41 a.m. UTC | #4
On Tue, 2024-11-12 at 16:43 +0100, Jan Beulich wrote:
> On 12.11.2024 16:28, Benjamin Drung wrote:
> > On Tue, 2024-11-12 at 14:35 +0100, Jan Beulich wrote:
> > > On 11.11.2024 11:02, Benjamin Drung wrote:
> > > > --- a/ld/NEWS
> > > > +++ b/ld/NEWS
> > > > @@ -8,6 +8,10 @@ Changes in 2.44:
> > > >  * Add a "--build-id=xx" option, if built with the xxhash library.  This
> > > >    produces a 128-bit hash, 2-4x faster than md5 or sha1.
> > > >  
> > > > +* The ELF linker option --package-metadata option supports percent-encoded
> > > 
> > > Nit: One too many "option" I think. Apart from this okay to put in as long
> > > as within two weeks no other comments / objections show up.
> > 
> > The second "option" should be removed. Should I resubmit with that typo
> > removed or will you adjust the patch when merging?
> 
> No need to re-submit just for that; I wasn't aware though that I ought to
> (eventually) merge it.

I am a drive-by contributor and do not know much about the project. The
"you" was a generic one and I meant the one merging it.
  

Patch

diff --git a/ld/NEWS b/ld/NEWS
index f7de85bd3ce..4403390a6c4 100644
--- a/ld/NEWS
+++ b/ld/NEWS
@@ -8,6 +8,10 @@  Changes in 2.44:
 * Add a "--build-id=xx" option, if built with the xxhash library.  This
   produces a 128-bit hash, 2-4x faster than md5 or sha1.
 
+* The ELF linker option --package-metadata option supports percent-encoded
+  and %[string] encoded JSON payload now to ease passing around this option
+  without getting the JSON payload corrupted.
+
 Changes in 2.43:
 
 * Add support for LoongArch DT_RELR (compressed R_LARCH_RELATIVE).
diff --git a/ld/emultempl/elf.em b/ld/emultempl/elf.em
index 2e865728587..dae610e07a2 100644
--- a/ld/emultempl/elf.em
+++ b/ld/emultempl/elf.em
@@ -867,8 +867,16 @@  gld${EMULATION_NAME}_handle_option (int optc)
     case OPTION_PACKAGE_METADATA:
       free ((char *) ldelf_emit_note_fdo_package_metadata);
       ldelf_emit_note_fdo_package_metadata = NULL;
-      if (optarg != NULL && strlen(optarg) > 0)
-	ldelf_emit_note_fdo_package_metadata = xstrdup (optarg);
+      if (optarg != NULL)
+	{
+	  size_t len = strlen (optarg);
+	  if (len > 0)
+	    {
+	      char *package_metadata = xmalloc (len + 1);
+	      percent_decode (optarg, package_metadata);
+	      ldelf_emit_note_fdo_package_metadata = package_metadata;
+	    }
+	}
       break;
 
     case OPTION_COMPRESS_DEBUG:
diff --git a/ld/ld.texi b/ld/ld.texi
index c66ac5de4b2..417af3e948d 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -3253,6 +3253,11 @@  Request the creation of a @code{.note.package} ELF note section.  The
 contents of the note are in JSON format, as per the package metadata
 specification.  For more information see:
 https://systemd.io/ELF_PACKAGE_METADATA/
+The JSON argument support percent-encoding and following %[string]
+(where string refers to the name in HTML's Named Character References)
+encoding: @samp{%[comma]} for @samp{,}, @samp{%[lbrace]} for @samp{@{},
+@samp{%[quot]} for @samp{"}, @samp{%[rbrace]} for @samp{@}}, and
+@samp{%[space]} for space character.
 If the JSON argument is missing/empty then this will disable the
 creation of the metadata note, if one had been enabled by an earlier
 occurrence of the --package-metadata option.
diff --git a/ld/ldmisc.c b/ld/ldmisc.c
index 180b24b3448..26c2763feb8 100644
--- a/ld/ldmisc.c
+++ b/ld/ldmisc.c
@@ -770,3 +770,81 @@  ld_abort (const char *file, int line, const char *fn)
   einfo (_("%F%P: please report this bug\n"));
   xexit (1);
 }
+
+/* Decode a hexadecimal character. Return -1 on error. */
+static int
+hexdecode (char c)
+{
+  if ('0' <= c && c <= '9')
+    return c - '0';
+  if ('A' <= c && c <= 'F')
+    return c - 'A' + 10;
+  if ('a' <= c && c <= 'f')
+    return c - 'a' + 10;
+  return -1;
+}
+
+/* Decode a percent and/or %[string] encoded string. dst must be at least
+   the same size as src. It can be converted in place.
+
+   Following %[string] encodings are supported:
+
+   %[comma] for ,
+   %[lbrace] for {
+   %[quot] for "
+   %[rbrace] for }
+   %[space] for ' '
+
+   The percent decoding behaves the same as Python's urllib.parse.unquote. */
+void
+percent_decode (const char *src, char *dst)
+{
+  while (*src != '\0')
+    {
+      char c = *src++;
+      if (c == '%')
+	{
+	  char next1 = *src;
+	  int hex1 = hexdecode (next1);
+	  if (hex1 != -1)
+	    {
+	      int hex2 = hexdecode (*(src + 1));
+	      if (hex2 != -1)
+		{
+		  c = (char) ((hex1 << 4) + hex2);
+		  src += 2;
+		}
+	    }
+	  else if (next1 == '[')
+	    {
+	      if (strncmp (src + 1, "comma]", 6) == 0)
+		{
+		  c = ',';
+		  src += 7;
+		}
+	      else if (strncmp (src + 1, "lbrace]", 7) == 0)
+		{
+		  c = '{';
+		  src += 8;
+		}
+	      else if (strncmp (src + 1, "quot]", 5) == 0)
+		{
+		  c = '"';
+		  src += 6;
+		}
+	      else if (strncmp (src + 1, "rbrace]", 7) == 0)
+		{
+		  c = '}';
+		  src += 8;
+		}
+	      else if (strncmp (src + 1, "space]", 6) == 0)
+		{
+		  c = ' ';
+		  src += 7;
+		}
+	    }
+	}
+      *dst++ = c;
+    }
+  *dst = '\0';
+}
diff --git a/ld/ldmisc.h b/ld/ldmisc.h
index 20289127c0a..ff73d799abe 100644
--- a/ld/ldmisc.h
+++ b/ld/ldmisc.h
@@ -39,5 +39,6 @@  do { info_assert(__FILE__,__LINE__); } while (0)
 extern void print_spaces (int);
 #define print_space() print_spaces (1)
 extern void print_nl (void);
+extern void percent_decode (const char *, char *);
 
 #endif
diff --git a/ld/testsuite/ld-elf/package-note.exp b/ld/testsuite/ld-elf/package-note.exp
index 9f18705bd65..a7f47776880 100644
--- a/ld/testsuite/ld-elf/package-note.exp
+++ b/ld/testsuite/ld-elf/package-note.exp
@@ -42,4 +42,40 @@  run_ld_link_tests [list \
 	{{readelf {--notes} package-note.rd}} \
 	"package-note.o" \
     ] \
+    [list \
+	"package-note1b.o" \
+	"--package-metadata=%7B%22foo%22%3A%22bar%22%7D" \
+	"" \
+	"" \
+	{start.s} \
+	{{readelf {--notes} package-note.rd}} \
+	"package-note1b.o" \
+    ] \
+    [list \
+	"package-note1c.o" \
+	"--package-metadata=%\[lbrace\]%\[quot\]foo%\[quot\]:%\[quot\]bar%\[quot\]%\[rbrace\]" \
+	"" \
+	"" \
+	{start.s} \
+	{{readelf {--notes} package-note.rd}} \
+	"package-note1c.o" \
+    ] \
+    [list \
+	"package-note2.o" \
+	"--package-metadata=%7B%22name%22:%22binutils%22%2C%22ver%22%3A%22x%20%%22%7d" \
+	"" \
+	"" \
+	{start.s} \
+	{{readelf {--notes} package-note2.rd}} \
+	"package-note2.o" \
+    ] \
+    [list \
+	"package-note2b.o" \
+	"--package-metadata={%\[quot\]name%\[quot\]:%\[quot\]binutils%\[quot\]%\[comma\]%\[quot\]ver%\[quot\]:%\[quot\]x%\[space\]%%\[quot\]}" \
+	"" \
+	"" \
+	{start.s} \
+	{{readelf {--notes} package-note2.rd}} \
+	"package-note2b.o" \
+    ] \
 ]
diff --git a/ld/testsuite/ld-elf/package-note2.rd b/ld/testsuite/ld-elf/package-note2.rd
new file mode 100644
index 00000000000..91c0f6d0fb0
--- /dev/null
+++ b/ld/testsuite/ld-elf/package-note2.rd
@@ -0,0 +1,6 @@ 
+#...
+Displaying notes found in: \.note\.package
+\s+Owner\s+Data\s+size\s+Description
+\s+FDO\s+0x00000020\s+(Unknown note type:\s+\(0xcafe1a7e\)|FDO_PACKAGING_METADATA)
+\s+(description data:\s+.*|Packaging Metadata:\s+{"name":"binutils","ver":"x %"})
+#pass