From patchwork Thu Mar 19 15:52:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 38584 Return-Path: X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qk1-x743.google.com (mail-qk1-x743.google.com [IPv6:2607:f8b0:4864:20::743]) by sourceware.org (Postfix) with ESMTPS id 083023877008 for ; Thu, 19 Mar 2020 15:52:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 083023877008 Received: by mail-qk1-x743.google.com with SMTP id d8so3579947qka.2 for ; Thu, 19 Mar 2020 08:52:10 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id; bh=VTvu43YK0hy4mOA4/Ba1ECWJISkaLFoTT0RGfQrJf80=; b=m5eyNVfA6gH1E2HMs1cgS1vEsc/dC92BPRGIWf5PFOg/c8wbvjuErCDlOzbYhzAhEv P/t0ncu9IJJtXWuRhByyM0fbtyDbxmEhGHlVt5TtHe/1x5SIXP1xS2DpQr+sOSctTQKF tGoa7I3kFZCDQz+aev//qgJ39i05l7TlyZzEWEiP9fdAAYnhdVD8YoV06Plv4KRv9t85 qDSXJyD8avzKUsiTU0kj07eG10rw+wDwD4cJH243/IskyRbmRMv+zM8TM9w+oAXCrRNV KDEi1/UD92VicalxO0owyHKutS3bErEX1I2ZPse+cOYV7NdykbVC1KnqD2nOeW2cQbpx Inbw== X-Gm-Message-State: ANhLgQ1fpn1ijHgsNjWDS1m0aec/kinBwB8uAhSYWRw+OTKDMF5tqKPn hz9gzRPQ5wOcaeXcdXBtxqzdF1WUeQA= X-Google-Smtp-Source: ADFU+vvhQFWs0OVaipZGOFQW9/zc4XJjgn56S/OAMSyq7eujegK9L9UKdiyEgSyjcDOgdML4eGJEFA== X-Received: by 2002:ae9:f501:: with SMTP id o1mr3656561qkg.204.1584633128970; Thu, 19 Mar 2020 08:52:08 -0700 (PDT) Received: from localhost.localdomain ([177.194.48.209]) by smtp.googlemail.com with ESMTPSA id m1sm1833292qtk.16.2020.03.19.08.52.07 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Mar 2020 08:52:08 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org Subject: [PATCH] stdio: Remove memory leak from multibyte convertion (BZ#25691) Date: Thu, 19 Mar 2020 12:52:02 -0300 Message-Id: <20200319155202.12366-1-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.17.1 X-Spam-Status: No, score=-25.2 required=5.0 tests=DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Mar 2020 15:52:11 -0000 Both default and wide printf functions might leak memory when manipulate multibyte characters conversion depending of the size of the input (whether __libc_use_alloca trigger or not the fallback heap allocation). This patch fixes it by issuing the required memory free operation and also change the alloca/malloc strategy to a scratch buffer. The testcase uses input argument size that trigger memory leaks on unpatched code, using a scratch buffer the threashold to use heap allocation is lower. Checked on x86_64-linux-gnu and i686-linux-gnu. --- stdio-common/Makefile | 9 ++- stdio-common/tst-printf-bz25691.c | 110 ++++++++++++++++++++++++++++++ stdio-common/vfprintf-internal.c | 108 +++++++++++++---------------- 3 files changed, 163 insertions(+), 64 deletions(-) create mode 100644 stdio-common/tst-printf-bz25691.c diff --git a/stdio-common/Makefile b/stdio-common/Makefile index 95af0c12d7..0a8d66b846 100644 --- a/stdio-common/Makefile +++ b/stdio-common/Makefile @@ -66,6 +66,7 @@ tests := tstscanf test_rdwr test-popen tstgetln test-fseek \ tst-scanf-round \ tst-renameat2 tst-bz11319 tst-bz11319-fortify2 \ scanf14a scanf16a \ + tst-printf-bz25691 test-srcs = tst-unbputc tst-printf tst-printfsz-islongdouble @@ -75,10 +76,12 @@ tests-special += $(objpfx)tst-unbputc.out $(objpfx)tst-printf.out \ $(objpfx)tst-printf-bz18872-mem.out \ $(objpfx)tst-setvbuf1-cmp.out \ $(objpfx)tst-vfprintf-width-prec-mem.out \ - $(objpfx)tst-printfsz-islongdouble.out + $(objpfx)tst-printfsz-islongdouble.out \ + $(objpfx)tst-printf-bz25691-mem.out generated += tst-printf-bz18872.c tst-printf-bz18872.mtrace \ tst-printf-bz18872-mem.out \ - tst-vfprintf-width-prec.mtrace tst-vfprintf-width-prec-mem.out + tst-vfprintf-width-prec.mtrace tst-vfprintf-width-prec-mem.out \ + tst-printf-bz25691.mtrace tst-printf-bz25691-mem.out endif include ../Rules @@ -100,6 +103,8 @@ endif tst-printf-bz18872-ENV = MALLOC_TRACE=$(objpfx)tst-printf-bz18872.mtrace tst-vfprintf-width-prec-ENV = \ MALLOC_TRACE=$(objpfx)tst-vfprintf-width-prec.mtrace +tst-printf-bz25691-ENV = \ + MALLOC_TRACE=$(objpfx)tst-printf-bz25691.mtrace $(objpfx)tst-unbputc.out: tst-unbputc.sh $(objpfx)tst-unbputc $(SHELL) $< $(common-objpfx) '$(test-program-prefix)' > $@; \ diff --git a/stdio-common/tst-printf-bz25691.c b/stdio-common/tst-printf-bz25691.c new file mode 100644 index 0000000000..4ebb91fb44 --- /dev/null +++ b/stdio-common/tst-printf-bz25691.c @@ -0,0 +1,110 @@ +/* Test for memory leak with large width (BZ#25691). + Copyright (C) 2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +static int +do_test (void) +{ + mtrace (); + + /* For 's' conversion specifier with 'l' modifier the array must be + converted to multibyte characters up to the precision specific + value. This might require a heap allocated temporary buffer for + large precisions. */ + { + /* The input size value is to force a heap allocation on temporary + buffer. */ + const size_t winputsize = 64 * 1024 + 1; + wchar_t *winput = xmalloc (winputsize * sizeof (wchar_t)); + wmemset (winput, L'a', winputsize - 1); + winput[winputsize - 1] = L'\0'; + + char result[9]; + const char expected[] = "aaaaaaaa"; + int ret; + + ret = snprintf (result, sizeof (result), "%.65537ls", winput); + TEST_COMPARE (ret, winputsize - 1); + TEST_COMPARE_BLOB (result, sizeof (result), expected, sizeof (expected)); + + ret = snprintf (result, sizeof (result), "%ls", winput); + TEST_COMPARE (ret, winputsize - 1); + TEST_COMPARE_BLOB (result, sizeof (result), expected, sizeof (expected)); + + free (winput); + } + + /* For 's' converstion specifier the array is interpreted as a multibyte + character sequence and converted to wide characters up to the precision + specific value. This might require a heap allocated temporary buffer + for large precisions. */ + { + /* The input size value is to force a heap allocation on temporary + buffer. */ + const size_t mbssize = 32 * 1024; + char *mbs = xmalloc (mbssize); + memset (mbs, 'a', mbssize - 1); + mbs[mbssize - 1] = '\0'; + + const size_t expectedsize = 32 * 1024; + wchar_t *expected = xmalloc (expectedsize * sizeof (wchar_t)); + wmemset (expected, L'a', expectedsize - 1); + expected[expectedsize-1] = L'\0'; + + const size_t resultsize = mbssize * sizeof (wchar_t); + wchar_t *result = xmalloc (resultsize); + int ret; + + ret = swprintf (result, resultsize, L"%.65537s", mbs); + TEST_COMPARE (ret, mbssize - 1); + TEST_COMPARE_BLOB (result, (ret + 1) * sizeof (wchar_t), + expected, expectedsize * sizeof (wchar_t)); + + ret = swprintf (result, resultsize, L"%1$.65537s", mbs); + TEST_COMPARE (ret, mbssize - 1); + TEST_COMPARE_BLOB (result, (ret + 1) * sizeof (wchar_t), + expected, expectedsize * sizeof (wchar_t)); + + /* Same test, but with an invalid multibyte sequence. */ + mbs[mbssize - 2] = 0xff; + + ret = swprintf (result, resultsize, L"%.65537s", mbs); + TEST_COMPARE (ret, -1); + + ret = swprintf (result, resultsize, L"%1$.65537s", mbs); + TEST_COMPARE (ret, -1); + + free (mbs); + free (result); + free (expected); + } + + return 0; +} + +#include diff --git a/stdio-common/vfprintf-internal.c b/stdio-common/vfprintf-internal.c index 3be92d4b6e..4ec246e081 100644 --- a/stdio-common/vfprintf-internal.c +++ b/stdio-common/vfprintf-internal.c @@ -1020,10 +1020,6 @@ static const uint8_t jump_table[] = break; \ \ LABEL (form_string): \ - { \ - size_t len; \ - int string_malloced; \ - \ /* The string argument could in fact be `char *' or `wchar_t *'. \ But this should not make a difference here. */ \ if (fspec == NULL) \ @@ -1032,9 +1028,12 @@ static const uint8_t jump_table[] = string = (CHAR_T *) args_value[fspec->data_arg].pa_wstring; \ \ /* Entry point for printing other strings. */ \ - LABEL (print_string): \ + LABEL (print_string): \ + { \ + size_t len; \ + struct scratch_buffer stringbuf; \ + scratch_buffer_init (&stringbuf); \ \ - string_malloced = 0; \ if (string == NULL) \ { \ /* Write "(null)" if there's space. */ \ @@ -1061,28 +1060,20 @@ static const uint8_t jump_table[] = /* Allocate dynamically an array which definitely is long \ enough for the wide character version. Each byte in the \ multi-byte string can produce at most one wide character. */ \ - if (__glibc_unlikely (len > SIZE_MAX / sizeof (wchar_t))) \ - { \ - __set_errno (EOVERFLOW); \ - done = -1; \ - goto all_done; \ - } \ - else if (__libc_use_alloca (len * sizeof (wchar_t))) \ - string = (CHAR_T *) alloca (len * sizeof (wchar_t)); \ - else if ((string = (CHAR_T *) malloc (len * sizeof (wchar_t))) \ - == NULL) \ + if (scratch_buffer_set_array_size (&stringbuf, len, \ + sizeof (wchar_t))) \ { \ - done = -1; \ - goto all_done; \ + string = stringbuf.data; \ + memset (&mbstate, '\0', sizeof (mbstate_t)); \ + len = __mbsrtowcs (string, &mbs, len, &mbstate); \ } \ - else \ - string_malloced = 1; \ + else \ + len = -1; \ \ - memset (&mbstate, '\0', sizeof (mbstate_t)); \ - len = __mbsrtowcs (string, &mbs, len, &mbstate); \ if (len == (size_t) -1) \ { \ /* Illegal multibyte character. */ \ + scratch_buffer_free (&stringbuf); \ done = -1; \ goto all_done; \ } \ @@ -1100,16 +1091,16 @@ static const uint8_t jump_table[] = if ((width -= len) < 0) \ { \ outstring (string, len); \ - break; \ } \ - \ - if (!left) \ - PAD (L' '); \ - outstring (string, len); \ - if (left) \ - PAD (L' '); \ - if (__glibc_unlikely (string_malloced)) \ - free (string); \ + else \ + { \ + if (!left) \ + PAD (L' '); \ + outstring (string, len); \ + if (left) \ + PAD (L' '); \ + } \ + scratch_buffer_free (&stringbuf); \ } \ break; #else @@ -1156,10 +1147,6 @@ static const uint8_t jump_table[] = break; \ \ LABEL (form_string): \ - { \ - size_t len; \ - int string_malloced; \ - \ /* The string argument could in fact be `char *' or `wchar_t *'. \ But this should not make a difference here. */ \ if (fspec == NULL) \ @@ -1168,9 +1155,12 @@ static const uint8_t jump_table[] = string = (char *) args_value[fspec->data_arg].pa_string; \ \ /* Entry point for printing other strings. */ \ - LABEL (print_string): \ + LABEL (print_string): \ + { \ + size_t len; \ + struct scratch_buffer stringbuf; \ + scratch_buffer_init (&stringbuf); \ \ - string_malloced = 0; \ if (string == NULL) \ { \ /* Write "(null)" if there's space. */ \ @@ -1201,19 +1191,16 @@ static const uint8_t jump_table[] = \ memset (&mbstate, '\0', sizeof (mbstate_t)); \ \ + len = -1; \ if (prec >= 0) \ { \ /* The string `s2' might not be NUL terminated. */ \ - if (__libc_use_alloca (prec)) \ - string = (char *) alloca (prec); \ - else if ((string = (char *) malloc (prec)) == NULL) \ + if (scratch_buffer_set_array_size (&stringbuf, prec, \ + sizeof (char))) \ { \ - done = -1; \ - goto all_done; \ + string = stringbuf.data; \ + len = __wcsrtombs (string, &s2, prec, &mbstate); \ } \ - else \ - string_malloced = 1; \ - len = __wcsrtombs (string, &s2, prec, &mbstate); \ } \ else \ { \ @@ -1222,22 +1209,19 @@ static const uint8_t jump_table[] = { \ assert (__mbsinit (&mbstate)); \ s2 = (const wchar_t *) string; \ - if (__libc_use_alloca (len + 1)) \ - string = (char *) alloca (len + 1); \ - else if ((string = (char *) malloc (len + 1)) == NULL) \ + if (scratch_buffer_set_array_size (&stringbuf, len + 1, \ + sizeof (char))) \ { \ - done = -1; \ - goto all_done; \ + string = stringbuf.data; \ + __wcsrtombs (string, &s2, len + 1, &mbstate); \ } \ - else \ - string_malloced = 1; \ - (void) __wcsrtombs (string, &s2, len + 1, &mbstate); \ } \ } \ \ if (len == (size_t) -1) \ { \ /* Illegal wide-character string. */ \ + scratch_buffer_free (&stringbuf); \ done = -1; \ goto all_done; \ } \ @@ -1246,16 +1230,16 @@ static const uint8_t jump_table[] = if ((width -= len) < 0) \ { \ outstring (string, len); \ - break; \ } \ - \ - if (!left) \ - PAD (' '); \ - outstring (string, len); \ - if (left) \ - PAD (' '); \ - if (__glibc_unlikely (string_malloced)) \ - free (string); \ + else \ + { \ + if (!left) \ + PAD (' '); \ + outstring (string, len); \ + if (left) \ + PAD (' '); \ + } \ + scratch_buffer_free (&stringbuf); \ } \ break; #endif