From patchwork Wed Mar 11 06:01:36 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daiki Ueno X-Patchwork-Id: 5566 Received: (qmail 17706 invoked by alias); 11 Mar 2015 06:01:46 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 17688 invoked by uid 89); 11 Mar 2015 06:01:45 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.5 required=5.0 tests=BAYES_05, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: fencepost.gnu.org Message-ID: From: Daiki Ueno To: bug-gettext@gnu.org Cc: Jakub Wilk , libc-alpha@sourceware.org Subject: intl: Proof against invalid offset/length Date: Wed, 11 Mar 2015 15:01:36 +0900 User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux) MIME-Version: 1.0 Hello, Last year, Jakub Wilk of Debian reported several crashes of msgunfmt caused by artificial MO files: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=769901 and we fixed them in gettext-tools (msgunfmt uses a different MO file parser than intl for some reason). However, I was too lazy to check if that is also the case for gettext-runtime (libintl) and thus GLIBC, and it is: $ mkdir -p en/LC_MESSAGES $ cp gettext/gettext-tools/tests/overflow*.mo en/LC_MESSAGES $ gdb `which gettext` (gdb) set environment TEXTDOMAINDIR . (gdb) set environment LC_ALL en_US.utf8 (gdb) run --domain=overflow-2 foo Starting program: /usr/bin/gettext --domain=overflow-2 foo Program received signal SIGSEGV, Segmentation fault. 0x000000365fa94190 in __memcpy_sse2 () from /lib64/libc.so.6 (gdb) bt #0 0x000000365fa94190 in __memcpy_sse2 () from /lib64/libc.so.6 #1 0x000000365fa30a72 in _nl_load_domain () from /lib64/libc.so.6 #2 0x000000365fa2fb9e in _nl_find_domain () from /lib64/libc.so.6 #3 0x000000365fa2f24c in __dcigettext () from /lib64/libc.so.6 #4 0x000000000040197c in main () (gdb) It is surprising that there are no checks of lengths/offsets read from MO files. Currently, I'm thinking of the attached patch (to gettext), which is a bit complicated. If anyone could suggest a cleaner approach, I'd appreciate it. Regards, --- Daiki Ueno From 8909a24b692e38472d7c05911f4b65eb24292c66 Mon Sep 17 00:00:00 2001 From: Daiki Ueno Date: Wed, 11 Mar 2015 14:43:34 +0900 Subject: [PATCH] intl: Proof against invalid offset/length * gettext-runtime/intl/loadmsgcat.c (_nl_load_domain): Check the upper bound of offset/length values read from MO file. * gettext-tools/tests/gettext-9: New file. * gettext-tools/tests/Makefile.am (TESTS): Add new test. --- gettext-runtime/intl/ChangeLog | 6 ++ gettext-runtime/intl/loadmsgcat.c | 132 ++++++++++++++++++++++++++++++-------- gettext-tools/tests/ChangeLog | 5 ++ gettext-tools/tests/Makefile.am | 2 +- gettext-tools/tests/gettext-9 | 24 +++++++ 5 files changed, 142 insertions(+), 27 deletions(-) create mode 100755 gettext-tools/tests/gettext-9 diff --git a/gettext-runtime/intl/ChangeLog b/gettext-runtime/intl/ChangeLog index d32774c..ed0a14d 100644 --- a/gettext-runtime/intl/ChangeLog +++ b/gettext-runtime/intl/ChangeLog @@ -1,3 +1,9 @@ +2015-03-11 Daiki Ueno + + intl: Proof against invalid offset/length + * loadmsgcat.c (_nl_load_domain): Check the upper bound of + offset/length values read from MO file. + 2015-01-22 Daiki Ueno intl: Update from gnulib diff --git a/gettext-runtime/intl/loadmsgcat.c b/gettext-runtime/intl/loadmsgcat.c index 8eb77d8..b97b316 100644 --- a/gettext-runtime/intl/loadmsgcat.c +++ b/gettext-runtime/intl/loadmsgcat.c @@ -799,6 +799,8 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, int revision; const char *nullentry; size_t nullentrylen; + size_t orig_tab_offset; + size_t trans_tab_offset; __libc_lock_lock_recursive (lock); if (domain_file->decided != 0) @@ -921,6 +923,7 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, domain->mmap_size = size; domain->must_swap = data->magic != _MAGIC; domain->malloced = NULL; + domain->hash_tab = NULL; /* Fill in the information about the available tables. */ revision = W (domain->must_swap, data->revision); @@ -930,16 +933,32 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, case 0: case 1: domain->nstrings = W (domain->must_swap, data->nstrings); + orig_tab_offset = W (domain->must_swap, data->orig_tab_offset); + if (__builtin_expect (orig_tab_offset >= size, 0)) + goto invalid; domain->orig_tab = (const struct string_desc *) - ((char *) data + W (domain->must_swap, data->orig_tab_offset)); + ((char *) data + orig_tab_offset); + trans_tab_offset = W (domain->must_swap, data->trans_tab_offset); + if (__builtin_expect (trans_tab_offset >= size, 0)) + goto invalid; domain->trans_tab = (const struct string_desc *) - ((char *) data + W (domain->must_swap, data->trans_tab_offset)); + ((char *) data + trans_tab_offset); domain->hash_size = W (domain->must_swap, data->hash_tab_size); - domain->hash_tab = - (domain->hash_size > 2 - ? (const nls_uint32 *) - ((char *) data + W (domain->must_swap, data->hash_tab_offset)) - : NULL); + if (__builtin_expect (domain->hash_size + >= SIZE_MAX / sizeof (nls_uint32), 0)) + goto invalid; + if (domain->hash_size > 2) + { + size_t hash_tab_byte_size = domain->hash_size * sizeof (nls_uint32); + size_t hash_tab_offset = W (domain->must_swap, data->hash_tab_offset); + if (__builtin_expect (hash_tab_offset + >= SIZE_MAX - hash_tab_byte_size, 0) + || __builtin_expect (hash_tab_offset + hash_tab_byte_size + >= size, 0)) + goto invalid; + domain->hash_tab = (const nls_uint32 *) + ((char *) data + hash_tab_offset); + } domain->must_swap_hash_tab = domain->must_swap; /* Now dispatch on the minor revision. */ @@ -968,6 +987,9 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, const char **sysdep_segment_values; const nls_uint32 *orig_sysdep_tab; const nls_uint32 *trans_sysdep_tab; + size_t sysdep_segments_offset; + size_t orig_sysdep_tab_offset; + size_t trans_sysdep_tab_offset; nls_uint32 n_inmem_sysdep_strings; size_t memneed; char *mem; @@ -979,20 +1001,30 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, /* Get the values of the system dependent segments. */ n_sysdep_segments = W (domain->must_swap, data->n_sysdep_segments); + sysdep_segments_offset = + W (domain->must_swap, data->sysdep_segments_offset); + if (__builtin_expect (sysdep_segments_offset >= size, 0)) + goto invalid; sysdep_segments = (const struct sysdep_segment *) - ((char *) data - + W (domain->must_swap, data->sysdep_segments_offset)); + ((char *) data + sysdep_segments_offset); sysdep_segment_values = (const char **) alloca (n_sysdep_segments * sizeof (const char *)); for (i = 0; i < n_sysdep_segments; i++) { - const char *name = - (char *) data - + W (domain->must_swap, sysdep_segments[i].offset); + const char *name; + size_t nameoff = + W (domain->must_swap, sysdep_segments[i].offset); nls_uint32 namelen = W (domain->must_swap, sysdep_segments[i].length); + if (__builtin_expect (namelen >= SIZE_MAX - nameoff, 0) + || __builtin_expect (nameoff + namelen >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + name = (char *) data + nameoff; if (!(namelen > 0 && name[namelen - 1] == '\0')) { freea (sysdep_segment_values); @@ -1002,12 +1034,24 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, sysdep_segment_values[i] = get_sysdep_segment_value (name); } + orig_sysdep_tab_offset = + W (domain->must_swap, data->orig_sysdep_tab_offset); + if (__builtin_expect (orig_sysdep_tab_offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } orig_sysdep_tab = (const nls_uint32 *) - ((char *) data - + W (domain->must_swap, data->orig_sysdep_tab_offset)); + ((char *) data + orig_sysdep_tab_offset); + trans_sysdep_tab_offset = + W (domain->must_swap, data->trans_sysdep_tab_offset); + if (__builtin_expect (trans_sysdep_tab_offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } trans_sysdep_tab = (const nls_uint32 *) - ((char *) data - + W (domain->must_swap, data->trans_sysdep_tab_offset)); + ((char *) data + trans_sysdep_tab_offset); /* Compute the amount of additional memory needed for the system dependent strings and the augmented hash table. @@ -1022,22 +1066,52 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, for (j = 0; j < 2; j++) { - const struct sysdep_string *sysdep_string = - (const struct sysdep_string *) - ((char *) data - + W (domain->must_swap, - j == 0 - ? orig_sysdep_tab[i] - : trans_sysdep_tab[i])); + size_t sysdep_string_offset; + const struct sysdep_string *sysdep_string; + size_t offset; size_t need = 0; - const struct segment_pair *p = sysdep_string->segments; + const struct segment_pair *p; + sysdep_string_offset = W (domain->must_swap, + j == 0 + ? orig_sysdep_tab[i] + : trans_sysdep_tab[i]); + if (__builtin_expect (sysdep_string_offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + sysdep_string = (const struct sysdep_string *) + ((char *) data + sysdep_string_offset); + + offset = W (domain->must_swap, sysdep_string->offset); + if (__builtin_expect (offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + + p = sysdep_string->segments; if (W (domain->must_swap, p->sysdepref) != SEGMENTS_END) for (p = sysdep_string->segments;; p++) { + nls_uint32 segsize = W (domain->must_swap, + p->segsize); nls_uint32 sysdepref; + size_t n; - need += W (domain->must_swap, p->segsize); + if (__builtin_expect (segsize + >= SIZE_MAX - offset, 0) + || __builtin_expect (offset + segsize + >= size, 0) + || __builtin_expect (segsize + >= SIZE_MAX - need, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + + need += segsize; sysdepref = W (domain->must_swap, p->sysdepref); if (sysdepref == SEGMENTS_END) @@ -1057,7 +1131,13 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, break; } - need += strlen (sysdep_segment_values[sysdepref]); + n = strlen (sysdep_segment_values[sysdepref]); + if (__builtin_expect (n >= SIZE_MAX - need, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + need += n; } needs[j] = need; diff --git a/gettext-tools/tests/ChangeLog b/gettext-tools/tests/ChangeLog index 72a46d4..9983a14 100644 --- a/gettext-tools/tests/ChangeLog +++ b/gettext-tools/tests/ChangeLog @@ -1,3 +1,8 @@ +2015-03-11 Daiki Ueno + + * gettext-9: New file. + * Makefile.am (TESTS): Add new test. + 2015-03-05 Daiki Ueno * format-kde-kuit-1: New file. diff --git a/gettext-tools/tests/Makefile.am b/gettext-tools/tests/Makefile.am index a4decaf..7c995e3 100644 --- a/gettext-tools/tests/Makefile.am +++ b/gettext-tools/tests/Makefile.am @@ -21,7 +21,7 @@ EXTRA_DIST = MOSTLYCLEANFILES = core *.stackdump TESTS = gettext-1 gettext-2 gettext-3 gettext-4 gettext-5 gettext-6 gettext-7 \ - gettext-8 \ + gettext-8 gettext-9 \ msgattrib-1 msgattrib-2 msgattrib-3 msgattrib-4 msgattrib-5 \ msgattrib-6 msgattrib-7 msgattrib-8 msgattrib-9 msgattrib-10 \ msgattrib-11 msgattrib-12 msgattrib-13 msgattrib-14 msgattrib-15 \ diff --git a/gettext-tools/tests/gettext-9 b/gettext-tools/tests/gettext-9 new file mode 100755 index 0000000..7d6d4c8 --- /dev/null +++ b/gettext-tools/tests/gettext-9 @@ -0,0 +1,24 @@ +#! /bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test invalid or incomplete input + +: ${LOCALE_FR=fr_FR} +{ test $LOCALE_FR != none && LC_ALL=$LOCALE_FR ../testlocale; } || { + if test -f /usr/bin/localedef; then + echo "Skipping test: no traditional french locale is installed" + else + echo "Skipping test: no traditional french locale is supported" + fi + exit 77 +} + +: ${GETTEXT=gettext} + +test -d fr || mkdir fr +test -d fr/LC_MESSAGES || mkdir fr/LC_MESSAGES + +for n in 1 2 3 4 5 6; do + cp "$abs_srcdir"/overflow-$n.mo fr/LC_MESSAGES + echo LANGUAGE= LC_ALL=${LOCALE_FR} TEXTDOMAINDIR=. ${GETTEXT} -d overflow-$n foo +done -- 2.1.0