Message ID | 02ea8f50-7b68-6c4e-75db-e919121e8707@redhat.com |
---|---|
State | New |
Series | RFC: Sort tarballs created by the src-release.sh script |
Commit Message
Nick Clifton
Sept. 28, 2022, 12:59 p.m. UTC
Hi Guys,
A contributor recently pointed out that the binutils release tarballs are not sorted by name, making it hard to reproducibly recreate them. At first we thought that using tar's --sort=name option would solve this, but it turns out that the src-release.sh script creates its own list of input file names, so instead I am proposing this patch:
Any comments or objections ?
Cheers
Nick
Comments
On Sep 28 2022, Nick Clifton via Gdb-patches wrote:
> diff --git a/src-release.sh b/src-release.sh
> index 079b545ae7c..caae5148034 100755
> --- a/src-release.sh
> +++ b/src-release.sh
> @@ -185,7 +185,7 @@ do_tar()
> echo "==> Making $package-$ver.tar"
> rm -f $package-$ver.tar
> find $package-$ver -follow \( $CVS_NAMES \) -prune \
> - -o -type f -print \
> + -o -type f -print | sort \
Better set LC_ALL=C to be independent from the locale sorting.
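To illustrate the locale concern, here is a minimal sketch. The helper name list_sorted and the file names are invented for the demo and are not part of src-release.sh; the point is only that forcing LC_ALL=C gives byte-wise ordering regardless of the caller's environment.

```shell
#!/bin/sh
# Sketch only: list_sorted is a hypothetical helper, not part of src-release.sh.
# It lists regular files under a directory in byte-wise (C locale) order.
list_sorted() {
    find "$1" -type f -print | LC_ALL=C sort
}

# Demo with invented file names in a scratch directory.
demo=$(mktemp -d)
touch "$demo/Zed" "$demo/alpha" "$demo/Beta" "$demo/gamma"

# C locale: uppercase letters sort before lowercase ones.
list_sorted "$demo"

# Caller's locale: the order may differ from one machine to another.
find "$demo" -type f -print | sort
```

With the C locale the list is Beta, Zed, alpha, gamma. The second command may or may not agree, depending on the locale it runs in, which is the reason for pinning it.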
Hi Andreas,
> Better set LC_ALL=C to be independent from the locale sorting.
Like this ?
diff --git a/src-release.sh b/src-release.sh
index 079b545ae7c..8c2a8d897fd 100755
--- a/src-release.sh
+++ b/src-release.sh
@@ -185,7 +185,7 @@ do_tar()
echo "==> Making $package-$ver.tar"
rm -f $package-$ver.tar
find $package-$ver -follow \( $CVS_NAMES \) -prune \
- -o -type f -print \
+ -o -type f -print | LC_ALL=C sort \
| tar cTfh - $package-$ver.tar
}
Cheers
Nick
PS. Would sort's --dictionary-order option have a similar effect ?
Hi Guys,
Thinking about this a little more last night, it occurred to me that if we want reproducible tarballs then we should not be storing user names, group names or modification times either. So what do you think about this extended version of the patch:
diff --git a/src-release.sh b/src-release.sh
index 079b545ae7c..908492c28f7 100755
--- a/src-release.sh
+++ b/src-release.sh
@@ -185,8 +185,8 @@ do_tar()
echo "==> Making $package-$ver.tar"
rm -f $package-$ver.tar
find $package-$ver -follow \( $CVS_NAMES \) -prune \
- -o -type f -print \
- | tar cTfh - $package-$ver.tar
+ -o -type f -print | LC_ALL=C sort \
+ | tar cTfh - $package-$ver.tar --mtime=0 --group=0 --owner=0
}
# Compress the output with bzip2
Cheers
Nick
On 29.09.2022 14:24, Nick Clifton via Binutils wrote:
> Thinking about this a little more last night, it occurred to me that if
> we want reproducible tarballs then we should not be storing user names,
> group names or modification times either. So what do you think about
> this extended version of the patch:
>
> diff --git a/src-release.sh b/src-release.sh
> index 079b545ae7c..908492c28f7 100755
> --- a/src-release.sh
> +++ b/src-release.sh
> @@ -185,8 +185,8 @@ do_tar()
> echo "==> Making $package-$ver.tar"
> rm -f $package-$ver.tar
> find $package-$ver -follow \( $CVS_NAMES \) -prune \
> - -o -type f -print \
> - | tar cTfh - $package-$ver.tar
> + -o -type f -print | LC_ALL=C sort \
> + | tar cTfh - $package-$ver.tar --mtime=0 --group=0 --owner=0
> }
I wanted to indicate that an mtime of zero isn't the neatest, but the two tar versions I've tried this with said anyway "Treating date '0' as 2022-09-29 00:00:00". A non-zero date with a time of zero is fine with me, but won't make much of a difference in terms of reproducibility.
Jan
Hi Guys,
Right, here is the latest and greatest - and hopefully last - version of the patch. I added a parseable string to the --mtime option and a comment explaining why these options are being used.
Any more comments/suggestions ?
Cheers
Nick
diff --git a/src-release.sh b/src-release.sh
index 079b545ae7c..8a2ac125030 100755
--- a/src-release.sh
+++ b/src-release.sh
@@ -184,9 +184,11 @@ do_tar()
ver=$2
echo "==> Making $package-$ver.tar"
rm -f $package-$ver.tar
+ # The sort command and --mtime, --group and --owner options are
+ # used in order to create consistent, reproducible tarballs.
find $package-$ver -follow \( $CVS_NAMES \) -prune \
- -o -type f -print \
- | tar cTfh - $package-$ver.tar
+ -o -type f -print | LC_ALL=C sort \
+ | tar cTfh - $package-$ver.tar --mtime="1970-01-01 00:00:00" --group=0 --owner=0
}
# Compress the output with bzip2
> On 30 Sep 2022, at 12:38, Nick Clifton via Binutils <binutils@sourceware.org> wrote:
>
> Hi Guys,
>
> Right, here is the latest and greatest - and hopefully last - version
> of the patch. I added a parseable string to the --mtime option and a
> comment explaining why these options are being used.
>
> Any more comments/suggestions ?
>
> Cheers
> Nick
>
> diff --git a/src-release.sh b/src-release.sh
> index 079b545ae7c..8a2ac125030 100755
> --- a/src-release.sh
> +++ b/src-release.sh
> @@ -184,9 +184,11 @@ do_tar()
> ver=$2
> echo "==> Making $package-$ver.tar"
> rm -f $package-$ver.tar
> + # The sort command and --mtime, --group and --owner options are
> + # used in order to create consistent, reproducible tarballs.
> find $package-$ver -follow \( $CVS_NAMES \) -prune \
> - -o -type f -print \
> - | tar cTfh - $package-$ver.tar
> + -o -type f -print | LC_ALL=C sort \
> + | tar cTfh - $package-$ver.tar --mtime="1970-01-01 00:00:00" --group=0 --owner=0
> }
>
> # Compress the output with bzip2
>
I think this might hit a problem I faced when trying to do this with Go tarballs: https://www.gnu.org/software/tar/manual/tar.html#warnings.
With that date, I got "implausibly old time stamp" warnings from tar. I haven't tested this patch though (writing from mobile, apologies).
Maybe default to the creation date of Binutils and allow overriding via https://reproducible-builds.org/docs/source-date-epoch/?
best,
sam
On 02.10.2022 09:54, Sam James wrote:
>> On 30 Sep 2022, at 12:38, Nick Clifton via Binutils <binutils@sourceware.org> wrote:
>>
>> Hi Guys,
>>
>> Right, here is the latest and greatest - and hopefully last - version
>> of the patch. I added a parseable string to the --mtime option and a
>> comment explaining why these options are being used.
>>
>> Any more comments/suggestions ?
>>
>> Cheers
>> Nick
>>
>> diff --git a/src-release.sh b/src-release.sh
>> index 079b545ae7c..8a2ac125030 100755
>> --- a/src-release.sh
>> +++ b/src-release.sh
>> @@ -184,9 +184,11 @@ do_tar()
>> ver=$2
>> echo "==> Making $package-$ver.tar"
>> rm -f $package-$ver.tar
>> + # The sort command and --mtime, --group and --owner options are
>> + # used in order to create consistent, reproducible tarballs.
>> find $package-$ver -follow \( $CVS_NAMES \) -prune \
>> - -o -type f -print \
>> - | tar cTfh - $package-$ver.tar
>> + -o -type f -print | LC_ALL=C sort \
>> + | tar cTfh - $package-$ver.tar --mtime="1970-01-01 00:00:00" --group=0 --owner=0
>> }
>>
>> # Compress the output with bzip2
>>
>
> I think this might hit a problem I faced when trying to do this with Go tarballs: https://www.gnu.org/software/tar/manual/tar.html#warnings.
>
> With that date, I got "implausibly old time stamp" warnings from tar. I haven't tested this patch though (writing from mobile, apologies).
>
> Maybe default to the creation date of Binutils and allow overriding via https://reproducible-builds.org/docs/source-date-epoch/?
Not sure what "creation date" might mean here. Assuming the script is (typically) run from a git tree, perhaps the commit date of the top level commit on the branch would be best to use?
Jan
> On 3 Oct 2022, at 07:56, Jan Beulich <jbeulich@suse.com> wrote:
>
> On 02.10.2022 09:54, Sam James wrote:
>>> On 30 Sep 2022, at 12:38, Nick Clifton via Binutils <binutils@sourceware.org> wrote:
>>>
>>> Hi Guys,
>>>
>>> Right, here is the latest and greatest - and hopefully last - version
>>> of the patch. I added a parseable string to the --mtime option and a
>>> comment explaining why these options are being used.
>>>
>>> Any more comments/suggestions ?
>>>
>>> Cheers
>>> Nick
>>>
>>> diff --git a/src-release.sh b/src-release.sh
>>> index 079b545ae7c..8a2ac125030 100755
>>> --- a/src-release.sh
>>> +++ b/src-release.sh
>>> @@ -184,9 +184,11 @@ do_tar()
>>> ver=$2
>>> echo "==> Making $package-$ver.tar"
>>> rm -f $package-$ver.tar
>>> + # The sort command and --mtime, --group and --owner options are
>>> + # used in order to create consistent, reproducible tarballs.
>>> find $package-$ver -follow \( $CVS_NAMES \) -prune \
>>> - -o -type f -print \
>>> - | tar cTfh - $package-$ver.tar
>>> + -o -type f -print | LC_ALL=C sort \
>>> + | tar cTfh - $package-$ver.tar --mtime="1970-01-01 00:00:00" --group=0 --owner=0
>>> }
>>>
>>> # Compress the output with bzip2
>>>
>>
>> I think this might hit a problem I faced when trying to do this with Go tarballs: https://www.gnu.org/software/tar/manual/tar.html#warnings.
>>
>> With that date, I got "implausibly old time stamp" warnings from tar. I haven't tested this patch though (writing from mobile, apologies).
>>
>> Maybe default to the creation date of Binutils and allow overriding via https://reproducible-builds.org/docs/source-date-epoch/?
>
> Not sure what "creation date" might mean here.
I meant whatever date folks consider to have been the creation of the Binutils project. First commit, announcement, first release, whatever.
Not that it really matters, it just can't be the unix epoch.
On Oct 03 2022, Sam James via Gdb-patches wrote:
> I meant whatever date folks consider to have been the creation of the Binutils project. First commit, announcement, first release, whatever.
I think using the date of the commit from which the tarball is being
created makes the most sense (this is what git archive does).
Another thing to consider is the recorded permission of the files.
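As a rough illustration of the permissions point, the following sketch assumes GNU tar and uses its --mode option to normalise the recorded permissions. The mode string is an illustrative choice, not something agreed in this thread, and the file names are invented.

```shell
#!/bin/sh
# Sketch only, assuming GNU tar. The mode string is an illustrative choice,
# not something the thread settled on.
work=$(mktemp -d)
echo data > "$work/file"

# Same content, archived under two different on-disk permissions.
chmod 600 "$work/file"
tar -C "$work" -cf "$work/strict.tar" --mode='go=rX,u+rw,a-s' \
    --mtime=@0 --owner=0 --group=0 file

chmod 664 "$work/file"
tar -C "$work" -cf "$work/loose.tar" --mode='go=rX,u+rw,a-s' \
    --mtime=@0 --owner=0 --group=0 file

# Both members are recorded as rw-r--r--, so the archives should match.
cmp "$work/strict.tar" "$work/loose.tar" && echo "identical"
```

Without --mode, the two archives would record 0600 and 0664 respectively, so a differing umask at checkout time would be enough to change the tarball.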
On Oct 02 2022, Sam James via Gdb-patches wrote:
> With that date, I got "implausibly old time stamp" warnings from tar.
That happens because --mtime uses the local timezone. To get the true
Epoch you can use --mtime=@0.
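The difference can be seen with a small experiment. This is a sketch assuming GNU tar; EST5 and JST-9 are POSIX-style TZ strings picked only for the demo, and the file name is invented.

```shell
#!/bin/sh
# Sketch only, assuming GNU tar. EST5 and JST-9 are POSIX-style TZ strings
# picked for the demo.
work=$(mktemp -d)
echo data > "$work/file"

# @0 is an absolute count of seconds since the Epoch, so TZ has no effect.
TZ=EST5  tar -C "$work" -cf "$work/epoch-a.tar" --mtime=@0 file
TZ=JST-9 tar -C "$work" -cf "$work/epoch-b.tar" --mtime=@0 file
cmp "$work/epoch-a.tar" "$work/epoch-b.tar" && echo "@0: identical"

# A calendar date without a zone is read in the local time zone,
# so the recorded timestamps are expected to differ.
TZ=EST5  tar -C "$work" -cf "$work/date-a.tar" --mtime='2022-10-03 00:00:00' file
TZ=JST-9 tar -C "$work" -cf "$work/date-b.tar" --mtime='2022-10-03 00:00:00' file
cmp -s "$work/date-a.tar" "$work/date-b.tar" || echo "date string: different"
```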
Hi Guys,
[This appears to be getting slightly out of hand...]
> Not sure what "creation date" might mean here. Assuming the script is
> (typically) run from a git tree, perhaps the commit date of the top
> level commit on the branch would be best to use?
Except that a commit to the branch that does not affect something that
would go into the tarball would then result in a changed date.
We could use the src-release.sh file itself, like this:
diff --git a/src-release.sh b/src-release.sh
index 079b545ae7c..de1f98a70bb 100755
--- a/src-release.sh
+++ b/src-release.sh
@@ -184,9 +184,15 @@ do_tar()
ver=$2
echo "==> Making $package-$ver.tar"
rm -f $package-$ver.tar
+ # The sort command and --mtime, --group and --owner options are
+ # used in order to create consistent, reproducible tarballs.
+ # BUILD_DATE is set to SOURCE_DATE_EPOCH if defined, or the
+ # modification date of this file otherwise. cf:
+ # https://reproducible-builds.org/docs/source-date-epoch/
+ BUILD_DATE="$(date --utc --date="@${SOURCE_DATE_EPOCH:-$(date -r src-release.sh +%s)}" +%Y-%m-%d)"
find $package-$ver -follow \( $CVS_NAMES \) -prune \
- -o -type f -print \
- | tar cTfh - $package-$ver.tar
+ -o -type f -print | LC_ALL=C sort \
+ | tar cTfh - $package-$ver.tar --mtime=$BUILD_DATE --group=0 --owner=0
}
# Compress the output with bzip2
Would that work ?
Cheers
Nick
On Oct 03 2022, Nick Clifton wrote:
> We could use the src-release.sh file itself, like this:
The timestamp of checked out files has no meaning, and is generally not reproducible.
> diff --git a/src-release.sh b/src-release.sh
> index 079b545ae7c..de1f98a70bb 100755
> --- a/src-release.sh
> +++ b/src-release.sh
> @@ -184,9 +184,15 @@ do_tar()
> ver=$2
> echo "==> Making $package-$ver.tar"
> rm -f $package-$ver.tar
> + # The sort command and --mtime, --group and --owner options are
> + # used in order to create consistent, reproducible tarballs.
> + # BUILD_DATE is set to SOURCE_DATE_EPOCH if defined, or the
> + # modification date of this file otherwise. cf:
> + # https://reproducible-builds.org/docs/source-date-epoch/
> + BUILD_DATE="$(date --utc --date="@${SOURCE_DATE_EPOCH:-$(date -r src-release.sh +%s)}" +%Y-%m-%d)"
> find $package-$ver -follow \( $CVS_NAMES \) -prune \
> - -o -type f -print \
> - | tar cTfh - $package-$ver.tar
> + -o -type f -print | LC_ALL=C sort \
> + | tar cTfh - $package-$ver.tar --mtime=$BUILD_DATE --group=0 --owner=0
That won't work, as --mtime=$BUILD_DATE is interpreted in the local time zone.
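One way to sidestep both objections is to keep the value as seconds since the Epoch and hand it to tar with a leading @. The following is only a sketch: build_epoch is a hypothetical helper, not the posted patch, and the fallback to the committer date of HEAD assumes the script runs inside a git checkout.

```shell
#!/bin/sh
# Sketch only: build_epoch is a hypothetical helper, not the posted patch.
# It prints seconds since the Epoch: SOURCE_DATE_EPOCH when set, otherwise
# the committer date of HEAD (which needs a git checkout).
build_epoch() {
    if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
        printf '%s\n' "$SOURCE_DATE_EPOCH"
    else
        git log -1 --format=%ct
    fi
}

# The leading @ makes tar read the value as an absolute timestamp, e.g.:
#   tar ... --mtime="@$(build_epoch)" --group=0 --owner=0
# Demo with an arbitrary example value:
SOURCE_DATE_EPOCH=1664755200 build_epoch
```

Because no calendar string is ever formatted or parsed, the local time zone does not enter into the result.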
On 03.10.2022 16:40, Nick Clifton wrote:
> Hi Guys,
>
> [This appears to be getting slightly out of hand...]
It might seem like that, yes, but I guess that's not entirely unexpected for a topic of that kind - different people have different expectations and habits.
>> Not sure what "creation date" might mean here. Assuming the script is
>> (typically) run from a git tree, perhaps the commit date of the top
>> level commit on the branch would be best to use?
> Except that a commit to the branch that does not affect something that
> would go into the tarball would then result in a changed date.
Every commit should be considered to affect the tarball, imo, as such effects could also be indirect. If you really wanted to go that route, then perhaps an alternative would be to use the commit date of the most recent commit touching bfd/version.m4.
Jan
Hi Guys,
On 10/4/22 08:10, Jan Beulich wrote:
> Every commit should be considered to affect the tarball, imo, as such
> effects could also be indirect. If you really wanted to go that route,
> then perhaps an alternative would be to use the commit date of the
> most recent commit touching bfd/version.m4.
Hmm, except that would probably only be appropriate for binutils tarballs, not others.
So how about the attached patch ? This one adds a new command line option to src-release.sh. If it is not used then the behaviour is not changed in any way. If the new option is used, it provides a date that is passed to tar's --mtime option, along with triggering the use of sort and the other tar options necessary to make a reproducible tarball. So:
src-release.sh -x -r `git log -1 --format=%cd --date=format:%F bfd/version.m4` binutils
should create a pretty consistent tarball.
Cheers
Nick
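A quick way to check the "pretty consistent" claim is to build the archive twice and compare the bytes. The sketch below assumes GNU tar; repro_tar is a hypothetical stand-in for do_tar using the options discussed in this thread, not the attached patch, and the explicit UTC suffix on the date is an added assumption to keep the local time zone out of the picture.

```shell
#!/bin/sh
# Sketch only, assuming GNU tar. repro_tar is a hypothetical stand-in for
# do_tar with the options discussed in this thread; it is not the attached patch.
repro_tar() {   # repro_tar DIR OUT DATE
    find "$1" -type f -print | LC_ALL=C sort \
        | tar -cf "$2" -T - --mtime="$3" --group=0 --owner=0
}

demo=$(mktemp -d)
mkdir "$demo/pkg"
echo one > "$demo/pkg/b"
echo two > "$demo/pkg/a"

(cd "$demo" && repro_tar pkg first.tar '2022-10-05 00:00:00 UTC')
touch "$demo/pkg/a"     # disturb the on-disk modification time
(cd "$demo" && repro_tar pkg second.tar '2022-10-05 00:00:00 UTC')

# Identical bytes mean the tarball was reproduced.
cmp "$demo/first.tar" "$demo/second.tar" && echo "reproduced"
```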
On 05.10.2022 14:23, Nick Clifton wrote:
> Hi Guys,
>
> On 10/4/22 08:10, Jan Beulich wrote:
>> Every commit should be considered to affect the tarball, imo, as such
>> effects could also be indirect. If you really wanted to go that route,
>> then perhaps an alternative would be to use the commit date of the
>> most recent commit touching bfd/version.m4.
>
> Hmm, except that would probably only be appropriate for binutils tarballs,
> not others.
>
> So how about the attached patch ? This one adds a new command line option to
> src-release.sh. If it is not used then the behaviour is not changed in any
> way. If the new option is used, it provides a date that is passed to tar's
> --mtime option, along with triggering the use of sort and the other tar
> options necessary to make a reproducible tarball. So:
>
> src-release.sh -x -r `git log -1 --format=%cd --date=format:%F bfd/version.m4` binutils
>
> should create a pretty consistent tarball.
Lgtm, fwiw. Just one nit: You may want to add the missing 'b' for "tarball" in the new help text line.
Jan
diff --git a/src-release.sh b/src-release.sh
index 079b545ae7c..caae5148034 100755
--- a/src-release.sh
+++ b/src-release.sh
@@ -185,7 +185,7 @@ do_tar()
echo "==> Making $package-$ver.tar"
rm -f $package-$ver.tar
find $package-$ver -follow \( $CVS_NAMES \) -prune \
- -o -type f -print \
+ -o -type f -print | sort \
| tar cTfh - $package-$ver.tar
}