From patchwork Thu Apr 29 17:27:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Carlos O'Donell
X-Patchwork-Id: 43193
To: libc-alpha@sourceware.org, fweimer@redhat.com, joseph@codesourcery.com
Subject: [PATCH] Make Unicode generation reproducible.
Date: Thu, 29 Apr 2021 13:27:42 -0400
Message-Id: <20210429172742.3301414-1-carlos@redhat.com>
X-Mailer: git-send-email 2.26.3
From: Carlos O'Donell <carlos@redhat.com>

The following changes make Unicode generation reproducible.

First we create a UnicodeRelease.txt file with metadata about the
release. This metadata contains the release date for the Unicode
version that we imported into glibc. Then we add APIs to
unicode_utils.py to access the release metadata. Then we refactor all
of the code to use the release metadata, which includes consistently
using the date of the Unicode release for the required
LC_IDENTIFICATION dates. If the existing files like i18n_ctype or
tr_TR have newer dates, then we keep those; otherwise we use the newer
date from the Unicode release.

All data files are regenerated with:

cd localedata/unicode-gen
make
make install

Subsequent regeneration will not alter any file dates, which makes the
Unicode generation reproducible.

Tested on x86_64 and i686 without regression.
---
 localedata/locales/i18n_ctype | 4 +-
 localedata/locales/tr_TR | 2 +-
 localedata/locales/translit_circle | 2 +-
 localedata/locales/translit_cjk_compat | 2 +-
 localedata/locales/translit_combining | 2 +-
 localedata/locales/translit_compat | 2 +-
 localedata/locales/translit_font | 2 +-
 localedata/locales/translit_fraction | 2 +-
 localedata/unicode-gen/Makefile | 66 ++++++++-----------
 localedata/unicode-gen/UnicodeRelease.txt | 8 +++
 localedata/unicode-gen/gen_translit_circle.py | 20 +++---
 .../unicode-gen/gen_translit_cjk_compat.py | 20 +++---
 .../unicode-gen/gen_translit_combining.py | 20 +++---
 localedata/unicode-gen/gen_translit_compat.py | 20 +++---
 localedata/unicode-gen/gen_translit_font.py | 20 +++---
 .../unicode-gen/gen_translit_fraction.py | 20 +++---
 localedata/unicode-gen/gen_unicode_ctype.py | 50 ++++++--------
 localedata/unicode-gen/unicode_utils.py | 38 +++++++++++
 localedata/unicode-gen/utf8_compatibility.py | 27 ++++----
 localedata/unicode-gen/utf8_gen.py | 61 +++++++----------
 20 files changed, 189 insertions(+), 199 deletions(-)
 create mode 100644 localedata/unicode-gen/UnicodeRelease.txt

diff --git a/localedata/locales/i18n_ctype b/localedata/locales/i18n_ctype index c63e0790fc..f5063fe743 100644 --- a/localedata/locales/i18n_ctype +++ b/localedata/locales/i18n_ctype @@ -13,7 +13,7 @@ comment_char % % information, but with different transliterations, can include it % directly. -% Generated automatically by gen_unicode_ctype.py for Unicode 12.1.0. +% Generated automatically by gen_unicode_ctype.py.
LC_IDENTIFICATION title "Unicode 13.0.0 FDCC-set" @@ -26,7 +26,7 @@ fax "" language "" territory "Earth" revision "13.0.0" -date "2020-06-25" +date "2021-03-10" category "i18n:2012";LC_CTYPE END LC_IDENTIFICATION diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR index 7dbb923228..ff8b315b7b 100644 --- a/localedata/locales/tr_TR +++ b/localedata/locales/tr_TR @@ -43,7 +43,7 @@ fax "" language "Turkish" territory "Turkey" revision "1.0" -date "2020-06-25" +date "2021-03-10" category "i18n:2012";LC_IDENTIFICATION category "i18n:2012";LC_CTYPE diff --git a/localedata/locales/translit_circle b/localedata/locales/translit_circle index 5c07b44532..f2ef558e2d 100644 --- a/localedata/locales/translit_circle +++ b/localedata/locales/translit_circle @@ -9,7 +9,7 @@ comment_char % % otherwise be governed by that license. % Transliterations of encircled characters. -% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2020-06-25 for Unicode 13.0.0. +% Generated automatically from UnicodeData.txt by gen_translit_circle.py for Unicode 13.0.0. LC_CTYPE diff --git a/localedata/locales/translit_cjk_compat b/localedata/locales/translit_cjk_compat index ee0d7f83c6..2696445dbf 100644 --- a/localedata/locales/translit_cjk_compat +++ b/localedata/locales/translit_cjk_compat @@ -9,7 +9,7 @@ comment_char % % otherwise be governed by that license. % Transliterations of CJK compatibility characters. -% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2020-06-25 for Unicode 13.0.0. +% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py for Unicode 13.0.0. LC_CTYPE diff --git a/localedata/locales/translit_combining b/localedata/locales/translit_combining index 36128f097a..b8e6b7efbd 100644 --- a/localedata/locales/translit_combining +++ b/localedata/locales/translit_combining @@ -10,7 +10,7 @@ comment_char % % Transliterations that remove all combining characters (accents, % pronounciation marks, etc.). -% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2020-06-25 for Unicode 13.0.0. +% Generated automatically from UnicodeData.txt by gen_translit_combining.py for Unicode 13.0.0. LC_CTYPE diff --git a/localedata/locales/translit_compat b/localedata/locales/translit_compat index ac24c4e938..61cdcccbc9 100644 --- a/localedata/locales/translit_compat +++ b/localedata/locales/translit_compat @@ -9,7 +9,7 @@ comment_char % % otherwise be governed by that license. % Transliterations of compatibility characters and ligatures. -% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2020-06-25 for Unicode 13.0.0. +% Generated automatically from UnicodeData.txt by gen_translit_compat.py for Unicode 13.0.0. LC_CTYPE diff --git a/localedata/locales/translit_font b/localedata/locales/translit_font index 680c4ed426..c3d7b44772 100644 --- a/localedata/locales/translit_font +++ b/localedata/locales/translit_font @@ -9,7 +9,7 @@ comment_char % % otherwise be governed by that license. % Transliterations of font equivalents. -% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2020-06-25 for Unicode 13.0.0. +% Generated automatically from UnicodeData.txt by gen_translit_font.py for Unicode 13.0.0. 
LC_CTYPE diff --git a/localedata/locales/translit_fraction b/localedata/locales/translit_fraction index b52244969e..292fe3e806 100644 --- a/localedata/locales/translit_fraction +++ b/localedata/locales/translit_fraction @@ -9,7 +9,7 @@ comment_char % % otherwise be governed by that license. % Transliterations of fractions. -% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2020-06-25 for Unicode 13.0.0. +% Generated automatically from UnicodeData.txt by gen_translit_fraction.py for Unicode 13.0.0. % The replacements have been surrounded with spaces, because fractions are % often preceded by a decimal number and followed by a unit or a math symbol. diff --git a/localedata/unicode-gen/Makefile b/localedata/unicode-gen/Makefile index d0dd1b78a5..b5c9c5517b 100644 --- a/localedata/unicode-gen/Makefile +++ b/localedata/unicode-gen/Makefile @@ -18,11 +18,10 @@ # Makefile for generating and updating Unicode-extracted files. -# This Makefile is NOT used as part of the GNU libc build. It needs -# to be run manually, within the source tree, at Unicode upgrades -# (change UNICODE_VERSION below), to update ../locales/i18n_ctype ctype -# information (part of the file is preserved, so don't wipe it all -# out), and ../charmaps/UTF-8. +# This Makefile is NOT used as part of the GNU libc build. It needs to +# be run manually, within the source tree, at Unicode upgrades, to +# update ../locales/i18n_ctype ctype information (part of the file is +# preserved, so don't wipe it all out), and ../charmaps/UTF-8. # Use make all to generate the files used in the glibc build out of # the original Unicode files; make check to verify that they are what @@ -33,13 +32,14 @@ # running afoul of the LGPL corresponding sources requirements, even # though it's not clear that they are preferred over the generated # files for making modifications. - - -UNICODE_VERSION = 13.0.0 +# +# The UnicodeRelease.txt file must be updated manually to include the +# information about the downloaded Unicode release. PYTHON3 = python3 WGET = wget +RELEASEDATA = UnicodeRelease.txt DOWNLOADS = UnicodeData.txt DerivedCoreProperties.txt EastAsianWidth.txt PropList.txt GENERATED = i18n_ctype tr_TR UTF-8 translit_combining translit_compat translit_circle translit_cjk_compat translit_font translit_fraction REPORTS = i18n_ctype-report UTF-8-report @@ -66,12 +66,10 @@ mostlyclean: .PHONY: all check clean mostlyclean install -i18n_ctype: UnicodeData.txt DerivedCoreProperties.txt +i18n_ctype: UnicodeData.txt DerivedCoreProperties.txt $(RELEASEDATA) i18n_ctype: ../locales/i18n_ctype # Preserve non-ctype information. i18n_ctype: gen_unicode_ctype.py - $(PYTHON3) gen_unicode_ctype.py -u UnicodeData.txt \ - -d DerivedCoreProperties.txt -i ../locales/i18n_ctype -o $@ \ - --unicode_version $(UNICODE_VERSION) + $(PYTHON3) gen_unicode_ctype.py -i ../locales/i18n_ctype -o $@ i18n_ctype-report: i18n_ctype ../locales/i18n_ctype i18n_ctype-report: ctype_compatibility.py ctype_compatibility_test_cases.py @@ -86,55 +84,45 @@ check-i18n_ctype: i18n_ctype-report tr_TR: UnicodeData.txt DerivedCoreProperties.txt tr_TR: ../locales/tr_TR # Preserve non-ctype information. 
tr_TR: gen_unicode_ctype.py - $(PYTHON3) gen_unicode_ctype.py -u UnicodeData.txt \ - -d DerivedCoreProperties.txt -i ../locales/tr_TR -o $@ \ - --unicode_version $(UNICODE_VERSION) --turkish + $(PYTHON3) gen_unicode_ctype.py -i ../locales/tr_TR -o $@ \ + --turkish -UTF-8: UnicodeData.txt EastAsianWidth.txt +UTF-8: UnicodeData.txt EastAsianWidth.txt $(RELEASEDATA) UTF-8: utf8_gen.py - $(PYTHON3) utf8_gen.py -u UnicodeData.txt \ - -e EastAsianWidth.txt -p PropList.txt \ - --unicode_version $(UNICODE_VERSION) + $(PYTHON3) utf8_gen.py UTF-8-report: UTF-8 ../charmaps/UTF-8 UTF-8-report: utf8_compatibility.py - $(PYTHON3) ./utf8_compatibility.py -u UnicodeData.txt \ - -e EastAsianWidth.txt -o ../charmaps/UTF-8 \ + $(PYTHON3) ./utf8_compatibility.py -o ../charmaps/UTF-8 \ -n UTF-8 -a -m -c > $@ check-UTF-8: UTF-8-report @if grep '^Total.*: [^0]' UTF-8-report; \ then echo manual verification required; false; else true; fi -translit_combining: UnicodeData.txt +translit_combining: UnicodeData.txt $(RELEASEDATA) translit_combining: gen_translit_combining.py - $(PYTHON3) ./gen_translit_combining.py -u UnicodeData.txt \ - -o $@ --unicode_version $(UNICODE_VERSION) + $(PYTHON3) ./gen_translit_combining.py -o $@ -translit_compat: UnicodeData.txt +translit_compat: UnicodeData.txt $(RELEASEDATA) translit_compat: gen_translit_compat.py - $(PYTHON3) ./gen_translit_compat.py -u UnicodeData.txt \ - -o $@ --unicode_version $(UNICODE_VERSION) + $(PYTHON3) ./gen_translit_compat.py -o $@ -translit_circle: UnicodeData.txt +translit_circle: UnicodeData.txt $(RELEASEDATA) translit_circle: gen_translit_circle.py - $(PYTHON3) ./gen_translit_circle.py -u UnicodeData.txt \ - -o $@ --unicode_version $(UNICODE_VERSION) + $(PYTHON3) ./gen_translit_circle.py -o $@ -translit_cjk_compat: UnicodeData.txt +translit_cjk_compat: UnicodeData.txt $(RELEASEDATA) translit_cjk_compat: gen_translit_cjk_compat.py - $(PYTHON3) ./gen_translit_cjk_compat.py -u UnicodeData.txt \ - -o $@ --unicode_version $(UNICODE_VERSION) + $(PYTHON3) ./gen_translit_cjk_compat.py -o $@ -translit_font: UnicodeData.txt +translit_font: UnicodeData.txt $(RELEASEDATA) translit_font: gen_translit_font.py - $(PYTHON3) ./gen_translit_font.py -u UnicodeData.txt \ - -o $@ --unicode_version $(UNICODE_VERSION) + $(PYTHON3) ./gen_translit_font.py -o $@ -translit_fraction: UnicodeData.txt +translit_fraction: UnicodeData.txt $(RELEASEDATA) translit_fraction: gen_translit_fraction.py - $(PYTHON3) ./gen_translit_fraction.py -u UnicodeData.txt \ - -o $@ --unicode_version $(UNICODE_VERSION) + $(PYTHON3) ./gen_translit_fraction.py -o $@ .PHONY: downloads clean-downloads downloads: $(DOWNLOADS) diff --git a/localedata/unicode-gen/UnicodeRelease.txt b/localedata/unicode-gen/UnicodeRelease.txt new file mode 100644 index 0000000000..bd9cc14ae0 --- /dev/null +++ b/localedata/unicode-gen/UnicodeRelease.txt @@ -0,0 +1,8 @@ +% This metadata is used by glibc and updated by the developer(s) +% carrying out the Unicode update. 
+Version,13.0.0 +ReleaseDate,2021-03-10 +Data,UnicodeData.txt +DcpData,DerivedCoreProperties.txt +EawData,EastAsianWidth.txt +PlData,PropList.txt diff --git a/localedata/unicode-gen/gen_translit_circle.py b/localedata/unicode-gen/gen_translit_circle.py index a83dccc163..cc897b2f5f 100644 --- a/localedata/unicode-gen/gen_translit_circle.py +++ b/localedata/unicode-gen/gen_translit_circle.py @@ -67,7 +67,6 @@ def output_head(translit_file, unicode_version, head=''): translit_file.write('% Transliterations of encircled characters.\n') translit_file.write('% Generated automatically from UnicodeData.txt ' + 'by gen_translit_circle.py ' - + 'on {:s} '.format(time.strftime('%Y-%m-%d')) + 'for Unicode {:s}.\n'.format(unicode_version)) translit_file.write('\n') translit_file.write('LC_CTYPE\n') @@ -110,11 +109,11 @@ if __name__ == "__main__": Generate a translit_circle file from UnicodeData.txt. ''') PARSER.add_argument( - '-u', '--unicode_data_file', + '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) PARSER.add_argument( '-i', '--input_file', @@ -133,19 +132,16 @@ if __name__ == "__main__": “translit_start” line and the tail from the “translit_end” line to the end of the file will be copied unchanged into the output file. ''') - PARSER.add_argument( - '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes(ARGS.unicode_data_file) + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) HEAD = TAIL = '' if ARGS.input_file: (HEAD, TAIL) = read_input_file(ARGS.input_file) with open(ARGS.output_file, mode='w') as TRANSLIT_FILE: - output_head(TRANSLIT_FILE, ARGS.unicode_version, head=HEAD) + output_head(TRANSLIT_FILE, unicode_version, head=HEAD) output_transliteration(TRANSLIT_FILE) output_tail(TRANSLIT_FILE, tail=TAIL) diff --git a/localedata/unicode-gen/gen_translit_cjk_compat.py b/localedata/unicode-gen/gen_translit_cjk_compat.py index a040511d06..ac127a8e21 100644 --- a/localedata/unicode-gen/gen_translit_cjk_compat.py +++ b/localedata/unicode-gen/gen_translit_cjk_compat.py @@ -69,7 +69,6 @@ def output_head(translit_file, unicode_version, head=''): translit_file.write('characters.\n') translit_file.write('% Generated automatically from UnicodeData.txt ' + 'by gen_translit_cjk_compat.py ' - + 'on {:s} '.format(time.strftime('%Y-%m-%d')) + 'for Unicode {:s}.\n'.format(unicode_version)) translit_file.write('\n') translit_file.write('LC_CTYPE\n') @@ -180,11 +179,11 @@ if __name__ == "__main__": Generate a translit_cjk_compat file from UnicodeData.txt. ''') PARSER.add_argument( - '-u', '--unicode_data_file', + '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) PARSER.add_argument( '-i', '--input_file', @@ -203,19 +202,16 @@ if __name__ == "__main__": “translit_start” line and the tail from the “translit_end” line to the end of the file will be copied unchanged into the output file. 
''') - PARSER.add_argument( - '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes(ARGS.unicode_data_file) + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) HEAD = TAIL = '' if ARGS.input_file: (HEAD, TAIL) = read_input_file(ARGS.input_file) with open(ARGS.output_file, mode='w') as TRANSLIT_FILE: - output_head(TRANSLIT_FILE, ARGS.unicode_version, head=HEAD) + output_head(TRANSLIT_FILE, unicode_version, head=HEAD) output_transliteration(TRANSLIT_FILE) output_tail(TRANSLIT_FILE, tail=TAIL) diff --git a/localedata/unicode-gen/gen_translit_combining.py b/localedata/unicode-gen/gen_translit_combining.py index 88be8f4b8a..082c0da92c 100644 --- a/localedata/unicode-gen/gen_translit_combining.py +++ b/localedata/unicode-gen/gen_translit_combining.py @@ -69,7 +69,6 @@ def output_head(translit_file, unicode_version, head=''): translit_file.write('% pronounciation marks, etc.).\n') translit_file.write('% Generated automatically from UnicodeData.txt ' + 'by gen_translit_combining.py ' - + 'on {:s} '.format(time.strftime('%Y-%m-%d')) + 'for Unicode {:s}.\n'.format(unicode_version)) translit_file.write('\n') translit_file.write('LC_CTYPE\n') @@ -404,11 +403,11 @@ if __name__ == "__main__": Generate a translit_combining file from UnicodeData.txt. ''') PARSER.add_argument( - '-u', '--unicode_data_file', + '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) PARSER.add_argument( '-i', '--input_file', @@ -427,19 +426,16 @@ if __name__ == "__main__": “translit_start” line and the tail from the “translit_end” line to the end of the file will be copied unchanged into the output file. ''') - PARSER.add_argument( - '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes(ARGS.unicode_data_file) + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) HEAD = TAIL = '' if ARGS.input_file: (HEAD, TAIL) = read_input_file(ARGS.input_file) with open(ARGS.output_file, mode='w') as TRANSLIT_FILE: - output_head(TRANSLIT_FILE, ARGS.unicode_version, head=HEAD) + output_head(TRANSLIT_FILE, unicode_version, head=HEAD) output_transliteration(TRANSLIT_FILE) output_tail(TRANSLIT_FILE, tail=TAIL) diff --git a/localedata/unicode-gen/gen_translit_compat.py b/localedata/unicode-gen/gen_translit_compat.py index c8c63b23af..ba144e9bee 100644 --- a/localedata/unicode-gen/gen_translit_compat.py +++ b/localedata/unicode-gen/gen_translit_compat.py @@ -68,7 +68,6 @@ def output_head(translit_file, unicode_version, head=''): translit_file.write('and ligatures.\n') translit_file.write('% Generated automatically from UnicodeData.txt ' + 'by gen_translit_compat.py ' - + 'on {:s} '.format(time.strftime('%Y-%m-%d')) + 'for Unicode {:s}.\n'.format(unicode_version)) translit_file.write('\n') translit_file.write('LC_CTYPE\n') @@ -286,11 +285,11 @@ if __name__ == "__main__": Generate a translit_compat file from UnicodeData.txt. 
''') PARSER.add_argument( - '-u', '--unicode_data_file', + '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) PARSER.add_argument( '-i', '--input_file', @@ -309,19 +308,16 @@ if __name__ == "__main__": “translit_start” line and the tail from the “translit_end” line to the end of the file will be copied unchanged into the output file. ''') - PARSER.add_argument( - '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes(ARGS.unicode_data_file) + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) HEAD = TAIL = '' if ARGS.input_file: (HEAD, TAIL) = read_input_file(ARGS.input_file) with open(ARGS.output_file, mode='w') as TRANSLIT_FILE: - output_head(TRANSLIT_FILE, ARGS.unicode_version, head=HEAD) + output_head(TRANSLIT_FILE, unicode_version, head=HEAD) output_transliteration(TRANSLIT_FILE) output_tail(TRANSLIT_FILE, tail=TAIL) diff --git a/localedata/unicode-gen/gen_translit_font.py b/localedata/unicode-gen/gen_translit_font.py index db41b47fab..93b2f128fa 100644 --- a/localedata/unicode-gen/gen_translit_font.py +++ b/localedata/unicode-gen/gen_translit_font.py @@ -67,7 +67,6 @@ def output_head(translit_file, unicode_version, head=''): translit_file.write('% Transliterations of font equivalents.\n') translit_file.write('% Generated automatically from UnicodeData.txt ' + 'by gen_translit_font.py ' - + 'on {:s} '.format(time.strftime('%Y-%m-%d')) + 'for Unicode {:s}.\n'.format(unicode_version)) translit_file.write('\n') translit_file.write('LC_CTYPE\n') @@ -116,11 +115,11 @@ if __name__ == "__main__": Generate a translit_font file from UnicodeData.txt. ''') PARSER.add_argument( - '-u', '--unicode_data_file', + '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) PARSER.add_argument( '-i', '--input_file', @@ -139,19 +138,16 @@ if __name__ == "__main__": “translit_start” line and the tail from the “translit_end” line to the end of the file will be copied unchanged into the output file. 
''') - PARSER.add_argument( - '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes(ARGS.unicode_data_file) + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) HEAD = TAIL = '' if ARGS.input_file: (HEAD, TAIL) = read_input_file(ARGS.input_file) with open(ARGS.output_file, mode='w') as TRANSLIT_FILE: - output_head(TRANSLIT_FILE, ARGS.unicode_version, head=HEAD) + output_head(TRANSLIT_FILE, unicode_version, head=HEAD) output_transliteration(TRANSLIT_FILE) output_tail(TRANSLIT_FILE, tail=TAIL) diff --git a/localedata/unicode-gen/gen_translit_fraction.py b/localedata/unicode-gen/gen_translit_fraction.py index c3c1513eb9..097cb04ea0 100644 --- a/localedata/unicode-gen/gen_translit_fraction.py +++ b/localedata/unicode-gen/gen_translit_fraction.py @@ -67,7 +67,6 @@ def output_head(translit_file, unicode_version, head=''): translit_file.write('% Transliterations of fractions.\n') translit_file.write('% Generated automatically from UnicodeData.txt ' + 'by gen_translit_fraction.py ' - + 'on {:s} '.format(time.strftime('%Y-%m-%d')) + 'for Unicode {:s}.\n'.format(unicode_version)) translit_file.write('% The replacements have been surrounded ') translit_file.write('with spaces, because fractions are\n') @@ -157,11 +156,11 @@ if __name__ == "__main__": Generate a translit_cjk_compat file from UnicodeData.txt. ''') PARSER.add_argument( - '-u', '--unicode_data_file', + '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) PARSER.add_argument( '-i', '--input_file', @@ -180,19 +179,16 @@ if __name__ == "__main__": “translit_start” line and the tail from the “translit_end” line to the end of the file will be copied unchanged into the output file. 
''') - PARSER.add_argument( '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes(ARGS.unicode_data_file) + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) HEAD = TAIL = '' if ARGS.input_file: (HEAD, TAIL) = read_input_file(ARGS.input_file) with open(ARGS.output_file, mode='w') as TRANSLIT_FILE: - output_head(TRANSLIT_FILE, ARGS.unicode_version, head=HEAD) + output_head(TRANSLIT_FILE, unicode_version, head=HEAD) output_transliteration(TRANSLIT_FILE) output_tail(TRANSLIT_FILE, tail=TAIL) diff --git a/localedata/unicode-gen/gen_unicode_ctype.py b/localedata/unicode-gen/gen_unicode_ctype.py index 7548961df1..41760567cf 100755 --- a/localedata/unicode-gen/gen_unicode_ctype.py +++ b/localedata/unicode-gen/gen_unicode_ctype.py @@ -32,6 +32,7 @@ To see how this script is used, call it with the “-h” option: import argparse import time import re +import datetime import unicode_utils def code_point_ranges(is_class_function): @@ -123,7 +124,7 @@ def output_charmap(i18n_file, map_name, map_function): i18n_file.write(line+'\n') i18n_file.write('\n') -def read_input_file(filename): +def read_input_file(filename, unicode_release_date): '''Reads the original glibc i18n file to get the original head and tail. @@ -140,8 +141,13 @@ def read_input_file(filename): r'^(?P<key>date\s+)(?P<value>"[0-9]{4}-[0-9]{2}-[0-9]{2}")', line) if match: - line = match.group('key') \ + '"{:s}"\n'.format(time.strftime('%Y-%m-%d')) + # Update the file date if the Unicode standard date + # is newer. + orig_date = datetime.date.fromisoformat(match.group('value').strip('"')) + new_date = datetime.date.fromisoformat(unicode_release_date) + if new_date > orig_date: + line = match.group('key') \ + '"{:s}"\n'.format(unicode_release_date) head = head + line if line.startswith('LC_CTYPE'): break @@ -153,7 +159,7 @@ def read_input_file(filename): tail = tail + line return (head, tail) -def output_head(i18n_file, unicode_version, head=''): +def output_head(i18n_file, unicode_version, unicode_release_date, head=''): '''Write the header of the output file, i.e. the part of the file before the “LC_CTYPE” line. ''' @@ -180,8 +186,7 @@ i18n_file.write('language ""\n') i18n_file.write('territory "Earth"\n') i18n_file.write('revision "{:s}"\n'.format(unicode_version)) - i18n_file.write('date "{:s}"\n'.format( - time.strftime('%Y-%m-%d'))) + i18n_file.write('date "{:s}"\n'.format(unicode_release_date)) i18n_file.write('category "i18n:2012";LC_CTYPE\n') i18n_file.write('END LC_IDENTIFICATION\n') i18n_file.write('\n') @@ -267,18 +272,11 @@ if __name__ == "__main__": UnicodeData.txt and DerivedCoreProperties.txt files.
''') PARSER.add_argument( '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' - + 'default: %(default)s')) - PARSER.add_argument( - '-d', '--derived_core_properties_file', - nargs='?', - type=str, - default='DerivedCoreProperties.txt', - help=('The DerivedCoreProperties.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) PARSER.add_argument( '-i', '--input_file', @@ -298,27 +296,21 @@ if __name__ == "__main__": classes and the date stamp in LC_IDENTIFICATION will be copied unchanged into the output file. ''') - PARSER.add_argument( - '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') PARSER.add_argument( '--turkish', action='store_true', help='Use Turkish case conversions.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes( - ARGS.unicode_data_file) - unicode_utils.fill_derived_core_properties( - ARGS.derived_core_properties_file) + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_release_date = unicode_utils.release_date(ARGS.unicode_data_dir) + unicode_utils.fill_attributes(unicode_utils.release_data_file(ARGS.unicode_data_dir)) + unicode_utils.fill_derived_core_properties(unicode_utils.release_dcp_file(ARGS.unicode_data_dir)) unicode_utils.verifications() HEAD = TAIL = '' if ARGS.input_file: - (HEAD, TAIL) = read_input_file(ARGS.input_file) + (HEAD, TAIL) = read_input_file(ARGS.input_file, unicode_release_date) with open(ARGS.output_file, mode='w') as I18N_FILE: - output_head(I18N_FILE, ARGS.unicode_version, head=HEAD) - output_tables(I18N_FILE, ARGS.unicode_version, ARGS.turkish) + output_head(I18N_FILE, unicode_version, unicode_release_date, head=HEAD) + output_tables(I18N_FILE, unicode_version, ARGS.turkish) output_tail(I18N_FILE, tail=TAIL) diff --git a/localedata/unicode-gen/unicode_utils.py b/localedata/unicode-gen/unicode_utils.py index 3263f4510b..2b7c6aaa45 100644 --- a/localedata/unicode-gen/unicode_utils.py +++ b/localedata/unicode-gen/unicode_utils.py @@ -525,3 +525,41 @@ def verifications(): and (is_graph(code_point) or code_point == 0x0020)): sys.stderr.write('%(sym)s is graph|<space> but not print\n' %{ 'sym': unicode_utils.ucs_symbol(code_point)}) + +def release_metadata(data_dir, parameter): + ''' Parse the UnicodeRelease.txt metadata and return the value for + the specified parameter.''' + value = "" + with open(data_dir + '/' + "UnicodeRelease.txt", "r") as f: + for line in f: + if line.strip().startswith('%'): + continue + fields = line.strip().split(",") + if fields[0] == parameter: + value = fields[1].strip() + assert value != "" + return value + +def release_version(data_dir): + ''' Return the Unicode version of the data in use.''' + return release_metadata(data_dir, "Version") + +def release_date(data_dir): + ''' Return the release date for the Unicode version of the data.''' + return release_metadata(data_dir, "ReleaseDate") + +def release_data_file(data_dir): + ''' The name of the primary data file.''' + return data_dir + '/' + release_metadata(data_dir, 'Data') + +def release_dcp_file(data_dir): + ''' The name of the derived core properties data file.''' + return data_dir + '/' + release_metadata(data_dir, 'DcpData') + +def release_eaw_file(data_dir): + ''' The name of the East Asian width data file.''' + return data_dir + '/' + release_metadata(data_dir, 'EawData') + +def 
release_pl_file(data_dir): + ''' The name of the properties list data file.''' + return data_dir + '/' + release_metadata(data_dir, 'PlData') diff --git a/localedata/unicode-gen/utf8_compatibility.py b/localedata/unicode-gen/utf8_compatibility.py index eca2e8cddc..7e485ba759 100755 --- a/localedata/unicode-gen/utf8_compatibility.py +++ b/localedata/unicode-gen/utf8_compatibility.py @@ -216,6 +216,13 @@ if __name__ == "__main__": description=''' Compare the contents of LC_CTYPE in two files and check for errors. ''') + PARSER.add_argument( + '-u', '--unicode_data_dir', + nargs='?', + type=str, + default='.', + help=('The directory containing Unicode data to read, ' + + 'default: %(default)s')) PARSER.add_argument( '-o', '--old_utf8_file', nargs='?', @@ -228,16 +235,6 @@ required=True, type=str, help='The new UTF-8 file.') - PARSER.add_argument( - '-u', '--unicode_data_file', - nargs='?', - type=str, - help='The UnicodeData.txt file to read.') - PARSER.add_argument( - '-e', '--east_asian_width_file', - nargs='?', - type=str, - help='The EastAsianWidth.txt file to read.') PARSER.add_argument( '-a', '--show_added_characters', action='store_true', @@ -252,9 +249,11 @@ help='Show characters whose width was changed in detail.') ARGS = PARSER.parse_args() - if ARGS.unicode_data_file: - unicode_utils.fill_attributes(ARGS.unicode_data_file) - if ARGS.east_asian_width_file: - unicode_utils.fill_east_asian_widths(ARGS.east_asian_width_file) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + east_asian_width_file = unicode_utils.release_eaw_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) + unicode_utils.fill_east_asian_widths(east_asian_width_file) + check_charmap(ARGS.old_utf8_file, ARGS.new_utf8_file) check_width(ARGS.old_utf8_file, ARGS.new_utf8_file) diff --git a/localedata/unicode-gen/utf8_gen.py b/localedata/unicode-gen/utf8_gen.py index 899840923a..4fc3038fe0 100755 --- a/localedata/unicode-gen/utf8_gen.py +++ b/localedata/unicode-gen/utf8_gen.py @@ -22,7 +22,7 @@ This script generates a glibc/localedata/charmaps/UTF-8 file from Unicode data.
-Usage: python3 utf8_gen.py UnicodeData.txt EastAsianWidth.txt +Usage: python3 utf8_gen.py It will output UTF-8 file ''' @@ -198,23 +198,27 @@ def write_header_charmap(outfile): outfile.write("% alias ISO-10646/UTF-8\n") outfile.write("CHARMAP\n") -def write_header_width(outfile, unicode_version): +def write_header_width(outfile, unicode_data_dir): '''Writes the header on top of the WIDTH section to the output file''' + unicode_version = unicode_utils.release_version(unicode_data_dir) + unicode_data = unicode_utils.release_metadata(unicode_data_dir, 'Data') + eaw_data = unicode_utils.release_metadata(unicode_data_dir, 'EawData') + pl_data = unicode_utils.release_metadata(unicode_data_dir, 'PlData') outfile.write('% Character width according to Unicode ' + '{:s}.\n'.format(unicode_version)) outfile.write('% - Default width is 1.\n') outfile.write('% - Double-width characters have width 2; generated from\n') - outfile.write('% "grep \'^[^;]*;[WF]\' EastAsianWidth.txt"\n') + outfile.write('% "grep \'^[^;]*;[WF]\' ' + eaw_data + '"\n') outfile.write('% - Non-spacing characters have width 0; ' - + 'generated from PropList.txt or\n') + + 'generated from ' + pl_data + ' or\n') outfile.write('% "grep \'^[^;]*;[^;]*;[^;]*;[^;]*;NSM;\' ' - + 'UnicodeData.txt"\n') + + unicode_data + '"\n') outfile.write('% - Format control characters have width 0; ' + 'generated from\n') - outfile.write("% \"grep '^[^;]*;[^;]*;Cf;' UnicodeData.txt\"\n") + outfile.write("% \"grep '^[^;]*;[^;]*;Cf;' " + unicode_data + "\"\n") # Not needed covered by Cf # outfile.write("% - Zero width characters have width 0; generated from\n") -# outfile.write("% \"grep '^[^;]*;ZERO WIDTH ' UnicodeData.txt\"\n") +# outfile.write("% \"grep '^[^;]*;ZERO WIDTH ' " + unicode_data + "\"\n") outfile.write("WIDTH\n") def process_width(outfile, ulines, elines, plines): @@ -302,41 +306,26 @@ def process_width(outfile, ulines, elines, plines): if __name__ == "__main__": PARSER = argparse.ArgumentParser( description=''' - Generate a UTF-8 file from UnicodeData.txt, EastAsianWidth.txt, and PropList.txt. + Generate a UTF-8 file from the Unicode release data files. 
''') PARSER.add_argument( - '-u', '--unicode_data_file', + '-u', '--unicode_data_dir', nargs='?', type=str, - default='UnicodeData.txt', - help=('The UnicodeData.txt file to read, ' + default='.', + help=('The directory containing Unicode data to read, ' + 'default: %(default)s')) - PARSER.add_argument( - '-e', '--east_asian_with_file', - nargs='?', - type=str, - default='EastAsianWidth.txt', - help=('The EastAsianWidth.txt file to read, ' - + 'default: %(default)s')) - PARSER.add_argument( - '-p', '--prop_list_file', - nargs='?', - type=str, - default='PropList.txt', - help=('The PropList.txt file to read, ' - + 'default: %(default)s')) - PARSER.add_argument( - '--unicode_version', - nargs='?', - required=True, - type=str, - help='The Unicode version of the input files used.') ARGS = PARSER.parse_args() - unicode_utils.fill_attributes(ARGS.unicode_data_file) - with open(ARGS.unicode_data_file, mode='r') as UNIDATA_FILE: + unicode_version = unicode_utils.release_version(ARGS.unicode_data_dir) + unicode_data_file = unicode_utils.release_data_file(ARGS.unicode_data_dir) + east_asian_width_file = unicode_utils.release_eaw_file(ARGS.unicode_data_dir) + prop_list_file = unicode_utils.release_pl_file(ARGS.unicode_data_dir) + + unicode_utils.fill_attributes(unicode_data_file) + with open(unicode_data_file, mode='r') as UNIDATA_FILE: UNICODE_DATA_LINES = UNIDATA_FILE.readlines() - with open(ARGS.east_asian_with_file, mode='r') as EAST_ASIAN_WIDTH_FILE: + with open(east_asian_width_file, mode='r') as EAST_ASIAN_WIDTH_FILE: EAST_ASIAN_WIDTH_LINES = [] for LINE in EAST_ASIAN_WIDTH_FILE: # If characters from EastAasianWidth.txt which are from @@ -352,7 +341,7 @@ if __name__ == "__main__": continue if re.match(r'^[^;]*;[WF]', LINE): EAST_ASIAN_WIDTH_LINES.append(LINE.strip()) - with open(ARGS.prop_list_file, mode='r') as PROP_LIST_FILE: + with open(prop_list_file, mode='r') as PROP_LIST_FILE: PROP_LIST_LINES = [] for LINE in PROP_LIST_FILE: if re.match(r'^[^;]*;[\s]*Prepended_Concatenation_Mark', LINE): @@ -363,7 +352,7 @@ if __name__ == "__main__": process_charmap(UNICODE_DATA_LINES, OUTFILE) OUTFILE.write("END CHARMAP\n\n") # Processing EastAsianWidth.txt and write WIDTH to UTF-8 file - write_header_width(OUTFILE, ARGS.unicode_version) + write_header_width(OUTFILE, ARGS.unicode_data_dir) process_width(OUTFILE, UNICODE_DATA_LINES, EAST_ASIAN_WIDTH_LINES,
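To make the new mechanism easy to review, here is a small standalone
sketch (not part of the patch) of how the two pieces fit together: the
UnicodeRelease.txt parser and the "keep the newer date" rule applied to
the LC_IDENTIFICATION date stamps. release_metadata() mirrors the helper
added to unicode_utils.py above; pick_date() is a hypothetical name used
only for this illustration (the patch inlines the equivalent comparison
in read_input_file() in gen_unicode_ctype.py).

import datetime

def release_metadata(data_dir, parameter):
    '''Return the value recorded for PARAMETER in UnicodeRelease.txt.'''
    value = ""
    with open(data_dir + '/UnicodeRelease.txt', 'r') as f:
        for line in f:
            if line.strip().startswith('%'):
                continue                     # '%' lines are comments
            fields = line.strip().split(',')
            if fields[0] == parameter:
                value = fields[1].strip()
    assert value != ""
    return value

def pick_date(existing_date, release_date):
    '''Keep the existing LC_IDENTIFICATION date unless the Unicode
    release date is newer, so repeated regeneration never changes an
    already-current date.'''
    if (datetime.date.fromisoformat(release_date)
            > datetime.date.fromisoformat(existing_date)):
        return release_date
    return existing_date

# i18n_ctype previously carried 2020-06-25; the Unicode 13.0.0 release
# date is 2021-03-10, so the date advances once and is then stable:
print(pick_date('2020-06-25', '2021-03-10'))  # -> 2021-03-10
print(pick_date('2021-03-10', '2021-03-10'))  # -> 2021-03-10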