From patchwork Thu May 19 21:06:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Florian Weimer
X-Patchwork-Id: 54238
To: libc-alpha@sourceware.org
Subject: [PATCH 2/5] locale: Fix signed char bug in lr_getc
Message-Id: <619cade7e73dc33184bf4247b739d54cd9d7d8b3.1652994079.git.fweimer@redhat.com>
Date: Thu, 19 May 2022 23:06:34 +0200
List-Id: Libc-alpha mailing list
From: Florian Weimer
Reply-To: Florian Weimer

The array lr->buf contains characters, which can be signed.  A 0xff
byte in the input could be incorrectly reported as EOF.  More
importantly, get_string in linereader.c converts a signed input byte
to a Unicode code point using ADDWC ((uint32_t) ch), under the
assumption that this decodes the ISO-8859-1 input encoding.  If char
is signed, this does not give the correct result.  This means that
ISO-8859-1 input files for localedef are not actually supported,
contrary to the comment in get_string.  This is a happy accident
because we can therefore change the file encoding to UTF-8 without
impacting backwards compatibility.
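
To illustrate the two failure modes described above, here is a minimal
standalone sketch.  It is not the glibc code: struct reader,
getc_unmasked, getc_masked and to_code_point are hypothetical names
invented for this example, and the buffer uses signed char explicitly
to model a platform where plain char is signed.  It also assumes 8-bit
two's complement bytes and EOF == -1, as is the case with glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the reader state; this is not glibc's
   struct linereader.  signed char models a signed plain char.  */
struct reader
{
  const signed char *buf;
  size_t len;
  size_t idx;
};

/* Pre-patch style: the buffer element is returned as is, so a byte
   with the high bit set is sign-extended to a negative int.  */
static int
getc_unmasked (struct reader *r)
{
  assert (r != NULL);
  if (r->idx == r->len)
    return EOF;
  return r->buf[r->idx++];
}

/* Post-patch style: masking keeps only the low eight bits, so the
   result is always in the range 0..255.  */
static int
getc_masked (struct reader *r)
{
  assert (r != NULL);
  if (r->idx == r->len)
    return EOF;
  return r->buf[r->idx++] & 0xff;
}

/* Mirrors the (uint32_t) ch conversion that the commit message
   attributes to get_string.  */
static uint32_t
to_code_point (int ch)
{
  return (uint32_t) ch;
}
```

Under these assumptions, a 0xff byte read through getc_unmasked comes
back as -1, which is indistinguishable from EOF, and a 0xe9 byte
(ISO-8859-1 e-acute) comes back as -23, which to_code_point turns into
0xffffffe9 instead of U+00E9.  With getc_masked the same bytes yield
255 and 233.
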

While at it, remove the \32 check for the MS-DOS end-of-file character
(^Z).

Reviewed-by: Carlos O'Donell
Tested-by: Carlos O'Donell
---
 locale/programs/linereader.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/locale/programs/linereader.h b/locale/programs/linereader.h
index 0fb10ec833..653a71d2d1 100644
--- a/locale/programs/linereader.h
+++ b/locale/programs/linereader.h
@@ -134,7 +134,7 @@ lr_getc (struct linereader *lr)
 	return EOF;
     }
 
-  return lr->buf[lr->idx] == '\32' ? EOF : lr->buf[lr->idx++];
+  return lr->buf[lr->idx++] & 0xff;
 }