From patchwork Fri Nov 20 20:24:49 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nathan Sidwell
X-Patchwork-Id: 41141
X-Patchwork-Delegate: siddhesh@gotplt.org
To: libc-alpha@sourceware.org
From: Nathan Sidwell
Subject: [PATCH] iconv: Fix incorrect UCS4 inner loop bounds (BZ#26923)
Message-ID: <8ed83e8d-66a4-bebe-d041-b66f4f5b5daf@acm.org>
Date: Fri, 20 Nov 2020 15:24:49 -0500

This is a reposting of
https://sourceware.org/pipermail/libc-alpha/2020-November/119822.html
blessing it with the FB disclaimerness that I have.
IANAL.

Previously, in UCS4 conversion routines we limit the number of
characters we examine to the minimum of the number of characters in the
input and the number of characters in the output. This is not the
correct behavior when __GCONV_IGNORE_ERRORS is set, as we do not consume
an output character when we skip a code unit. Instead, track the input
and output pointers and terminate the loop when either reaches its
limit.

This resolves assertion failures when resetting the input buffer in a step of
iconv, which assumes that the input will be fully consumed given sufficient
output space.

diff --git a/iconv/Makefile b/iconv/Makefile
index 30bf996d3a..f9b51e23ec 100644
--- a/iconv/Makefile
+++ b/iconv/Makefile
@@ -44,7 +44,7 @@ CFLAGS-linereader.c += -DNO_TRANSLITERATION
 CFLAGS-simple-hash.c += -I../locale
 
 tests = tst-iconv1 tst-iconv2 tst-iconv3 tst-iconv4 tst-iconv5 tst-iconv6 \
-        tst-iconv7 tst-iconv-mt tst-iconv-opt
+        tst-iconv7 tst-iconv8 tst-iconv-mt tst-iconv-opt
 
 others = iconv_prog iconvconfig
 install-others-programs = $(inst_bindir)/iconv
diff --git a/iconv/gconv_simple.c b/iconv/gconv_simple.c
index d4797fba17..963b29f246 100644
--- a/iconv/gconv_simple.c
+++ b/iconv/gconv_simple.c
@@ -239,11 +239,9 @@ ucs4_internal_loop (struct __gconv_step *step,
   int flags = step_data->__flags;
   const unsigned char *inptr = *inptrp;
   unsigned char *outptr = *outptrp;
-  size_t n_convert = MIN (inend - inptr, outend - outptr) / 4;
   int result;
-  size_t cnt;
 
-  for (cnt = 0; cnt < n_convert; ++cnt, inptr += 4)
+  for (; inptr + 4 <= inend && outptr + 4 <= outend; inptr += 4)
     {
       uint32_t inval;
 
@@ -307,11 +305,9 @@ ucs4_internal_loop_unaligned (struct __gconv_step *step,
   int flags = step_data->__flags;
   const unsigned char *inptr = *inptrp;
   unsigned char *outptr = *outptrp;
-  size_t n_convert = MIN (inend - inptr, outend - outptr) / 4;
   int result;
-  size_t cnt;
 
-  for (cnt = 0; cnt < n_convert; ++cnt, inptr += 4)
+  for (; inptr + 4 <= inend && outptr + 4 <= outend; inptr += 4)
     {
       if (__glibc_unlikely (inptr[0] > 0x80))
         {
@@ -613,11 +609,9 @@ ucs4le_internal_loop (struct __gconv_step *step,
   int flags = step_data->__flags;
   const unsigned char *inptr = *inptrp;
   unsigned char *outptr = *outptrp;
-  size_t n_convert = MIN (inend - inptr, outend - outptr) / 4;
   int result;
-  size_t cnt;
 
-  for (cnt = 0; cnt < n_convert; ++cnt, inptr += 4)
+  for (; inptr + 4 <= inend && outptr + 4 <= outend; inptr += 4)
     {
       uint32_t inval;
 
@@ -684,11 +678,9 @@ ucs4le_internal_loop_unaligned (struct __gconv_step *step,
   int flags = step_data->__flags;
   const unsigned char *inptr = *inptrp;
   unsigned char *outptr = *outptrp;
-  size_t n_convert = MIN (inend - inptr, outend - outptr) / 4;
   int result;
-  size_t cnt;
 
-  for (cnt = 0; cnt < n_convert; ++cnt, inptr += 4)
+  for (; inptr + 4 <= inend && outptr + 4 <= outend; inptr += 4)
     {
       if (__glibc_unlikely (inptr[3] > 0x80))
         {
diff --git a/iconv/tst-iconv8.c b/iconv/tst-iconv8.c
new file mode 100644
index 0000000000..0b92b19f66
--- /dev/null
+++ b/iconv/tst-iconv8.c
@@ -0,0 +1,50 @@
+/* Test iconv behavior on UCS4 conversions with //IGNORE.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+/* Derived from BZ #26923 */
+#include
+#include
+#include
+#include
+
+static int
+do_test (void)
+{
+  iconv_t cd = iconv_open ("UTF-8//IGNORE", "ISO-10646/UCS4/");
+  TEST_VERIFY_EXIT (cd != (iconv_t) -1);
+
+  /*
+   * Convert sequence beginning with an irreversible character into buffer that
+   * is too small.
+   */
+  char input[12] = "\xe1\x80\xa1" "AAAAAAAAA";
+  char *inptr = input;
+  size_t insize = sizeof (input);
+  char output[6];
+  char *outptr = output;
+  size_t outsize = sizeof (output);
+
+  TEST_VERIFY (iconv (cd, &inptr, &insize, &outptr, &outsize) == -1);
+  TEST_VERIFY (errno == E2BIG);
+
+  TEST_VERIFY_EXIT (iconv_close (cd) != -1);
+
+  return 0;
+}
+
+#include
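
For readers who want to see the loop-bound problem in isolation, here is a
minimal standalone sketch. It is not glibc code: copy_units_counted,
copy_units_bounded and unit_is_valid are hypothetical names, and the
"conversion" is a plain 4-byte copy that silently skips any unit above
0x7fffffff, standing in for the skip path taken when errors are ignored.
The bounded loop's condition is written as a remaining-length comparison,
which is assumed here to be equivalent to the pointer test in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical validity rule for this sketch only: a 4-byte unit is
   skipped when its value is above 0x7fffffff.  */
static int
unit_is_valid (uint32_t v)
{
  return v <= 0x7fffffffu;
}

/* Old shape: the iteration count is fixed before the loop starts, so a
   skipped unit still uses up one iteration even though it writes no
   output.  Returns the number of input bytes consumed.  */
size_t
copy_units_counted (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen, size_t *written)
{
  assert (written != NULL);
  const unsigned char *inptr = in;
  unsigned char *outptr = out;
  size_t n_convert = (inlen < outlen ? inlen : outlen) / 4;

  for (size_t cnt = 0; cnt < n_convert; ++cnt, inptr += 4)
    {
      uint32_t v;
      memcpy (&v, inptr, 4);
      if (!unit_is_valid (v))
        continue;               /* Skipped: no output space is used.  */
      memcpy (outptr, inptr, 4);
      outptr += 4;
    }

  *written = (size_t) (outptr - out);
  return (size_t) (inptr - in);
}

/* New shape: stop only when the input or the output actually runs out.  */
size_t
copy_units_bounded (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen, size_t *written)
{
  assert (written != NULL);
  const unsigned char *inptr = in;
  const unsigned char *inend = in + inlen;
  unsigned char *outptr = out;
  unsigned char *outend = out + outlen;

  for (; inend - inptr >= 4 && outend - outptr >= 4; inptr += 4)
    {
      uint32_t v;
      memcpy (&v, inptr, 4);
      if (!unit_is_valid (v))
        continue;               /* Skipped: no output space is used.  */
      memcpy (outptr, inptr, 4);
      outptr += 4;
    }

  *written = (size_t) (outptr - out);
  return (size_t) (inptr - in);
}
```

With a 12-byte input whose first unit is invalid and an 8-byte output
buffer, the counted loop runs only two iterations and stops after 8 input
bytes with 4 output bytes still free. The bounded loop consumes all 12
input bytes and fills the output. Stopping early while output space
remains is the situation the commit message describes as breaking the
assumption that input is fully consumed given sufficient output space.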