From patchwork Mon Jan 8 09:53:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mike FABIAN
X-Patchwork-Id: 83517
X-Patchwork-Delegate: mfabian@redhat.com
From: Mike FABIAN
To: libc-alpha@sourceware.org
Cc: Mike FABIAN
Subject: [PATCH 2/2] localedata: unicode-gen: Remove redundant \s* from regexp, fix comments
Date: Mon, 8 Jan 2024 10:53:53 +0100
Message-ID: <20240108095355.1200521-2-mfabian@redhat.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240108095355.1200521-1-mfabian@redhat.com>
References: <20240108095355.1200521-1-mfabian@redhat.com>
List-Id: Libc-alpha mailing list

---
 localedata/charmaps/UTF-8          | 2 +-
 localedata/unicode-gen/utf8_gen.py | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/localedata/charmaps/UTF-8 b/localedata/charmaps/UTF-8
index 94f20d5e87..b545cc9b25 100644
--- a/localedata/charmaps/UTF-8
+++ b/localedata/charmaps/UTF-8
@@ -49858,7 +49858,7 @@ END CHARMAP
 % Character width according to Unicode 15.1.0.
 % - Default width is 1.
 % - Double-width characters have width 2; generated from
-% "grep '^[^;]*;[WF]' EastAsianWidth.txt"
+% "grep '^[^;]*;\s*[WF]' EastAsianWidth.txt"
 % - Non-spacing characters have width 0; generated from PropList.txt or
 % "grep '^[^;]*;[^;]*;[^;]*;[^;]*;NSM;' UnicodeData.txt"
 % - Format control characters have width 0; generated from
diff --git a/localedata/unicode-gen/utf8_gen.py b/localedata/unicode-gen/utf8_gen.py
index 5e77333bb4..f744e87ffc 100755
--- a/localedata/unicode-gen/utf8_gen.py
+++ b/localedata/unicode-gen/utf8_gen.py
@@ -204,7 +204,7 @@ def write_header_width(outfile, unicode_version):
                   + '{:s}.\n'.format(unicode_version))
     outfile.write('% - Default width is 1.\n')
     outfile.write('% - Double-width characters have width 2; generated from\n')
-    outfile.write('% "grep \'^[^;]*;[WF]\' EastAsianWidth.txt"\n')
+    outfile.write('% "grep \'^[^;]*;\\s*[WF]\' EastAsianWidth.txt"\n')
     outfile.write('% - Non-spacing characters have width 0; '
                   + 'generated from PropList.txt or\n')
     outfile.write('% "grep \'^[^;]*;[^;]*;[^;]*;[^;]*;NSM;\' '
@@ -339,8 +339,8 @@ if __name__ == "__main__":
     with open(ARGS.east_asian_with_file, mode='r') as EAST_ASIAN_WIDTH_FILE:
         EAST_ASIAN_WIDTH_LINES = []
         for LINE in EAST_ASIAN_WIDTH_FILE:
-            # If characters from EastAasianWidth.txt which are from
-            # from reserved ranges (i.e. not yet assigned code points)
+            # If characters from EastAsianWidth.txt which are from
+            # reserved ranges (i.e. not yet assigned code points)
             # are added to the WIDTH section of the UTF-8 file, then
             # “make check” produces “Unknown Character” errors for
             # these code points because such unassigned code points
@@ -350,7 +350,7 @@ if __name__ == "__main__":
             # the EastAsianWidth.txt file.
             if re.match(r'.*\.\..*', LINE):
                 continue
-            if re.match(r'^[^;]*;\s*[WF]\s*', LINE):
+            if re.match(r'^[^;]*;\s*[WF]', LINE):
                 EAST_ASIAN_WIDTH_LINES.append(LINE.strip())
     with open(ARGS.prop_list_file, mode='r') as PROP_LIST_FILE:
         PROP_LIST_LINES = []
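
Why the trailing \s* is redundant: re.match anchors the pattern only at the start of the string, so a final \s* that may match zero characters can never change whether a line is accepted. The sketch below is a minimal check of that claim, not part of the patch. The two patterns are taken from the diff above; the sample lines and the helper name selects are made up for illustration and are not copied from EastAsianWidth.txt.

```python
import re

# Patterns from the patch: before and after dropping the trailing \s*.
OLD_PATTERN = re.compile(r'^[^;]*;\s*[WF]\s*')
NEW_PATTERN = re.compile(r'^[^;]*;\s*[WF]')

# Hypothetical sample lines (assumption: they only imitate the general
# "code point ; property # comment" layout, with and without spaces).
SAMPLE_LINES = [
    '3000 ; F # wide, spaces around the separator',
    '3000;F # wide, no spaces around the separator',
    '0041 ; Na # narrow, must not be selected',
    '00A1;A # ambiguous, must not be selected',
    '# comment line without a separator',
    '',
]


def selects(pattern, line):
    """Return True if re.match accepts the line (anchored at the start only)."""
    return pattern.match(line) is not None


# One (old, new) pair per sample line; both entries should always agree.
RESULTS = [(selects(OLD_PATTERN, line), selects(NEW_PATTERN, line))
           for line in SAMPLE_LINES]

if __name__ == '__main__':
    for line, (old, new) in zip(SAMPLE_LINES, RESULTS):
        print('{!r}: old={} new={}'.format(line, old, new))
```

Both patterns select the same lines here. The \s* before [WF] stays necessary, since it is what accepts lines that have whitespace after the semicolon.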