intl: Proof against invalid offset/length
Commit Message
Hello,
Last year, Jakub Wilk of Debian reported several crashes of msgunfmt
caused by artificial MO files:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=769901
and we fixed them in gettext-tools (msgunfmt uses a different MO file
parser than intl for some reason).
However, I was too lazy to check if that is also the case for
gettext-runtime (libintl) and thus GLIBC, and it is:
$ mkdir -p en/LC_MESSAGES
$ cp gettext/gettext-tools/tests/overflow*.mo en/LC_MESSAGES
$ gdb `which gettext`
(gdb) set environment TEXTDOMAINDIR .
(gdb) set environment LC_ALL en_US.utf8
(gdb) run --domain=overflow-2 foo
Starting program: /usr/bin/gettext --domain=overflow-2 foo
Program received signal SIGSEGV, Segmentation fault.
0x000000365fa94190 in __memcpy_sse2 () from /lib64/libc.so.6
(gdb) bt
#0 0x000000365fa94190 in __memcpy_sse2 () from /lib64/libc.so.6
#1 0x000000365fa30a72 in _nl_load_domain () from /lib64/libc.so.6
#2 0x000000365fa2fb9e in _nl_find_domain () from /lib64/libc.so.6
#3 0x000000365fa2f24c in __dcigettext () from /lib64/libc.so.6
#4 0x000000000040197c in main ()
(gdb)
It is surprising that there are no checks of lengths/offsets read from
MO files. Currently, I'm thinking of the attached patch (to gettext),
which is a bit complicated. If anyone could suggest a cleaner approach,
I'd appreciate it.
Regards,
--
Daiki Ueno
Comments
On 03/11/2015 02:01 AM, Daiki Ueno wrote:
> It is surprising that there are no checks of lengths/offsets read from
> MO files. Currently, I'm thinking of the attached patch (to gettext),
> which is a bit complicated. If anyone could suggest a cleaner approach,
> I'd appreciate it.
Why does it surprise you?
The MO files are writable only by root, so it's not a security issue
because if you could write to them you'd be root, and you'd have
full access to the system anyway.
The other alternative is that the filesystem is corrupted and loading
the MO file crashes your application. This is expected since the
filesystem is corrupted. You are suggesting we add some rather complex
checking for the possibly low probability case of a corrupted
filesystem. If the filesystem is corrupted other things will be failing
and you need to fix the corruption.
What strong technical reasons do you have for propsing these additional
checks?
Cheers,
Carlos.
On 11 Mar 2015 02:39, Carlos O'Donell wrote:
> On 03/11/2015 02:01 AM, Daiki Ueno wrote:
> > It is surprising that there are no checks of lengths/offsets read from
> > MO files. Currently, I'm thinking of the attached patch (to gettext),
> > which is a bit complicated. If anyone could suggest a cleaner approach,
> > I'd appreciate it.
>
> Why does it surprise you?
>
> The MO files are writable only by root, so it's not a security issue
> because if you could write to them you'd be root, and you'd have
> full access to the system anyway.
>
> The other alternative is that the filesystem is corrupted and loading
> the MO file crashes your application. This is expected since the
> filesystem is corrupted. You are suggesting we add some rather complex
> checking for the possibly low probability case of a corrupted
> filesystem. If the filesystem is corrupted other things will be failing
> and you need to fix the corruption.
>
> What strong technical reasons do you have for propsing these additional
> checks?
i thought you could control things via $TEXTDOMAIN/$TEXTDOMAINDIR, but it looks
like just `bash` and `gettext` respect those ? so if you have a shell script
that either directly supports translated messages (e.g. bash's $"..."), or
indirectly (e.g. manually calling `gettext`), and it doesn't lock down the
TEXTDOMAINDIR envvar properly, you could get them to load untrusted data and
crash due to the omitted range checks in glibc ?
i'm not really familiar with how much gettext relies on glibc though or if it
just entirely uses its own copy of code.
using Debian's code search [1], it looks like git provides GIT_TEXTDOMAINDIR to
override the default TEXTDOMAINDIR. i stopped at page ~6 ;).
-mike
[1] http://codesearch.debian.net/perpackage-results/TEXTDOMAINDIR/
Mike Frysinger <vapier@gentoo.org> writes:
>> What strong technical reasons do you have for propsing these additional
>> checks?
>
> i thought you could control things via $TEXTDOMAIN/$TEXTDOMAINDIR, but it looks
> like just `bash` and `gettext` respect those ? so if you have a shell script
> that either directly supports translated messages (e.g. bash's $"..."), or
> indirectly (e.g. manually calling `gettext`), and it doesn't lock down the
> TEXTDOMAINDIR envvar properly, you could get them to load untrusted data and
> crash due to the omitted range checks in glibc ?
bindtextdomain is the only place to configure the location, and it
seems to be the design:
http://thread.gmane.org/gmane.comp.lib.glibc.alpha/575
However, I too observed a few programs which use the location obtained
from environment variable. Perhaps it would be nice to suggest using
the fixed location in the documentation.
Regards,
--
Daiki Ueno
Carlos O'Donell wrote:
> The MO files are writable only by root, so it's not a security issue
> because if you could write to them you'd be root, and you'd have
> full access to the system anyway.
Your argument is similar to Ulrich Drepper's argument: MO files are part
of the distribution of a package, like executables and shared libraries.
glibc does not check against invalid offsets in shared libraries either,
and the kernel does not check against illegal instructions that happen
to exist in executables and shared libraries.
But these arguments don't consider the LANGUAGE variable. The original
intent of LANGUAGE was that it contains colon-separated language or locale
identifiers. But in fact, you can specify relative files names that start
with "../", and thus you can make the _nl_load_domain function in glibc
access files anywhere in the file system. For example:
$ LANGUAGE=../../../../../../../../../../../../../../tmp/hack/crashing-mos strace cp . .
...
open("/usr/share/locale/../../../../../../../../../../../../../../tmp/hack/crashing-mos/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale-langpack/../../../../../../../../../../../../../../tmp/hack/crashing-mos/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
...
If I had put a hacked .mo file at /tmp/hack/crashing-mos/LC_MESSAGES/coreutils.mo
I would have crashed the 'cp' program from coreutils. Likewise with
any program from any package.
Bruno
On 03/12/2015 02:04 AM, Bruno Haible wrote:
> But these arguments don't consider the LANGUAGE variable. The original
> intent of LANGUAGE was that it contains colon-separated language or locale
> identifiers. But in fact, you can specify relative files names that start
> with "../", and thus you can make the _nl_load_domain function in glibc
> access files anywhere in the file system.
Yes, this is bug 17142.
On 03/13/2015 10:29 AM, Florian Weimer wrote:
> On 03/12/2015 02:04 AM, Bruno Haible wrote:
>
>> But these arguments don't consider the LANGUAGE variable. The original
>> intent of LANGUAGE was that it contains colon-separated language or locale
>> identifiers. But in fact, you can specify relative files names that start
>> with "../", and thus you can make the _nl_load_domain function in glibc
>> access files anywhere in the file system.
>
> Yes, this is bug 17142.
In my opinion we need to restrict LANGUAGE just like we restricted all
other the other variables in CVE-2014-0475.
Cheers,
Carlos.
"Carlos O'Donell" <carlos@redhat.com> writes:
> On 03/13/2015 10:29 AM, Florian Weimer wrote:
>> On 03/12/2015 02:04 AM, Bruno Haible wrote:
>>
>>> But these arguments don't consider the LANGUAGE variable. The original
>>> intent of LANGUAGE was that it contains colon-separated language or locale
>>> identifiers. But in fact, you can specify relative files names that start
>>> with "../", and thus you can make the _nl_load_domain function in glibc
>>> access files anywhere in the file system.
>>
>> Yes, this is bug 17142.
>
> In my opinion we need to restrict LANGUAGE just like we restricted all
> other the other variables in CVE-2014-0475.
I agree. Now that intl/ is almost[1] synchronized with gettext, what's
blocking this? I'm happy to include the patch in the upcoming gettext
release so non-glibc consumers also benefit from it.
Footnotes:
[1] except a minor adjustment for non-glibc systems:
https://lists.gnu.org/archive/html/bug-gettext/2015-01/msg00029.html
and absolute pathname handling, which I guess could be reverted because
of not much use:
http://git.savannah.gnu.org/cgit/gettext.git/commit/?id=279b57fc
Regards,
On 03/20/2015 02:06 AM, Daiki Ueno wrote:
> I agree. Now that intl/ is almost[1] synchronized with gettext, what's
> blocking this? I'm happy to include the patch in the upcoming gettext
> release so non-glibc consumers also benefit from it.
The patch will use getauxval(AT_SECURE) or __libc_enable_secure (or
issetuugid on other systems, but which I cannot test). It is not going
to be very portable.
Florian Weimer <fweimer@redhat.com> writes:
> The patch will use getauxval(AT_SECURE) or __libc_enable_secure (or
> issetuugid on other systems, but which I cannot test). It is not going
> to be very portable.
I see (though I'm a bit confused that you removed the use of
__libc_enable_secure in CVE-2014-0475). Can't you use secure_getenv,
which Gnulib provides a replacement, compare the result with
the normal getenv, and apply the pathname check if needed?
Regards,
On 03/21/2015 04:17 AM, Daiki Ueno wrote:
> Florian Weimer <fweimer@redhat.com> writes:
>
>> The patch will use getauxval(AT_SECURE) or __libc_enable_secure (or
>> issetuugid on other systems, but which I cannot test). It is not going
>> to be very portable.
>
> I see (though I'm a bit confused that you removed the use of
> __libc_enable_secure in CVE-2014-0475). Can't you use secure_getenv,
> which Gnulib provides a replacement, compare the result with
> the normal getenv, and apply the pathname check if needed?
Hmm, I was under the impression that absolute paths for LANGUAGE were a
supported feature. If that's not the case, we can just reject directory
traversal and confine lookups to the system locale directory, like we
did for the other locale files.
From 8909a24b692e38472d7c05911f4b65eb24292c66 Mon Sep 17 00:00:00 2001
From: Daiki Ueno <ueno@gnu.org>
Date: Wed, 11 Mar 2015 14:43:34 +0900
Subject: [PATCH] intl: Proof against invalid offset/length
* gettext-runtime/intl/loadmsgcat.c (_nl_load_domain): Check the upper
bound of offset/length values read from MO file.
* gettext-tools/tests/gettext-9: New file.
* gettext-tools/tests/Makefile.am (TESTS): Add new test.
---
gettext-runtime/intl/ChangeLog | 6 ++
gettext-runtime/intl/loadmsgcat.c | 132 ++++++++++++++++++++++++++++++--------
gettext-tools/tests/ChangeLog | 5 ++
gettext-tools/tests/Makefile.am | 2 +-
gettext-tools/tests/gettext-9 | 24 +++++++
5 files changed, 142 insertions(+), 27 deletions(-)
create mode 100755 gettext-tools/tests/gettext-9
@@ -1,3 +1,9 @@
+2015-03-11 Daiki Ueno <ueno@gnu.org>
+
+ intl: Proof against invalid offset/length
+ * loadmsgcat.c (_nl_load_domain): Check the upper bound of
+ offset/length values read from MO file.
+
2015-01-22 Daiki Ueno <ueno@gnu.org>
intl: Update from gnulib
@@ -799,6 +799,8 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
int revision;
const char *nullentry;
size_t nullentrylen;
+ size_t orig_tab_offset;
+ size_t trans_tab_offset;
__libc_lock_lock_recursive (lock);
if (domain_file->decided != 0)
@@ -921,6 +923,7 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
domain->mmap_size = size;
domain->must_swap = data->magic != _MAGIC;
domain->malloced = NULL;
+ domain->hash_tab = NULL;
/* Fill in the information about the available tables. */
revision = W (domain->must_swap, data->revision);
@@ -930,16 +933,32 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
case 0:
case 1:
domain->nstrings = W (domain->must_swap, data->nstrings);
+ orig_tab_offset = W (domain->must_swap, data->orig_tab_offset);
+ if (__builtin_expect (orig_tab_offset >= size, 0))
+ goto invalid;
domain->orig_tab = (const struct string_desc *)
- ((char *) data + W (domain->must_swap, data->orig_tab_offset));
+ ((char *) data + orig_tab_offset);
+ trans_tab_offset = W (domain->must_swap, data->trans_tab_offset);
+ if (__builtin_expect (trans_tab_offset >= size, 0))
+ goto invalid;
domain->trans_tab = (const struct string_desc *)
- ((char *) data + W (domain->must_swap, data->trans_tab_offset));
+ ((char *) data + trans_tab_offset);
domain->hash_size = W (domain->must_swap, data->hash_tab_size);
- domain->hash_tab =
- (domain->hash_size > 2
- ? (const nls_uint32 *)
- ((char *) data + W (domain->must_swap, data->hash_tab_offset))
- : NULL);
+ if (__builtin_expect (domain->hash_size
+ >= SIZE_MAX / sizeof (nls_uint32), 0))
+ goto invalid;
+ if (domain->hash_size > 2)
+ {
+ size_t hash_tab_byte_size = domain->hash_size * sizeof (nls_uint32);
+ size_t hash_tab_offset = W (domain->must_swap, data->hash_tab_offset);
+ if (__builtin_expect (hash_tab_offset
+ >= SIZE_MAX - hash_tab_byte_size, 0)
+ || __builtin_expect (hash_tab_offset + hash_tab_byte_size
+ >= size, 0))
+ goto invalid;
+ domain->hash_tab = (const nls_uint32 *)
+ ((char *) data + hash_tab_offset);
+ }
domain->must_swap_hash_tab = domain->must_swap;
/* Now dispatch on the minor revision. */
@@ -968,6 +987,9 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
const char **sysdep_segment_values;
const nls_uint32 *orig_sysdep_tab;
const nls_uint32 *trans_sysdep_tab;
+ size_t sysdep_segments_offset;
+ size_t orig_sysdep_tab_offset;
+ size_t trans_sysdep_tab_offset;
nls_uint32 n_inmem_sysdep_strings;
size_t memneed;
char *mem;
@@ -979,20 +1001,30 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
/* Get the values of the system dependent segments. */
n_sysdep_segments =
W (domain->must_swap, data->n_sysdep_segments);
+ sysdep_segments_offset =
+ W (domain->must_swap, data->sysdep_segments_offset);
+ if (__builtin_expect (sysdep_segments_offset >= size, 0))
+ goto invalid;
sysdep_segments = (const struct sysdep_segment *)
- ((char *) data
- + W (domain->must_swap, data->sysdep_segments_offset));
+ ((char *) data + sysdep_segments_offset);
sysdep_segment_values =
(const char **)
alloca (n_sysdep_segments * sizeof (const char *));
for (i = 0; i < n_sysdep_segments; i++)
{
- const char *name =
- (char *) data
- + W (domain->must_swap, sysdep_segments[i].offset);
+ const char *name;
+ size_t nameoff =
+ W (domain->must_swap, sysdep_segments[i].offset);
nls_uint32 namelen =
W (domain->must_swap, sysdep_segments[i].length);
+ if (__builtin_expect (namelen >= SIZE_MAX - nameoff, 0)
+ || __builtin_expect (nameoff + namelen >= size, 0))
+ {
+ freea (sysdep_segment_values);
+ goto invalid;
+ }
+ name = (char *) data + nameoff;
if (!(namelen > 0 && name[namelen - 1] == '\0'))
{
freea (sysdep_segment_values);
@@ -1002,12 +1034,24 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
sysdep_segment_values[i] = get_sysdep_segment_value (name);
}
+ orig_sysdep_tab_offset =
+ W (domain->must_swap, data->orig_sysdep_tab_offset);
+ if (__builtin_expect (orig_sysdep_tab_offset >= size, 0))
+ {
+ freea (sysdep_segment_values);
+ goto invalid;
+ }
orig_sysdep_tab = (const nls_uint32 *)
- ((char *) data
- + W (domain->must_swap, data->orig_sysdep_tab_offset));
+ ((char *) data + orig_sysdep_tab_offset);
+ trans_sysdep_tab_offset =
+ W (domain->must_swap, data->trans_sysdep_tab_offset);
+ if (__builtin_expect (trans_sysdep_tab_offset >= size, 0))
+ {
+ freea (sysdep_segment_values);
+ goto invalid;
+ }
trans_sysdep_tab = (const nls_uint32 *)
- ((char *) data
- + W (domain->must_swap, data->trans_sysdep_tab_offset));
+ ((char *) data + trans_sysdep_tab_offset);
/* Compute the amount of additional memory needed for the
system dependent strings and the augmented hash table.
@@ -1022,22 +1066,52 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
for (j = 0; j < 2; j++)
{
- const struct sysdep_string *sysdep_string =
- (const struct sysdep_string *)
- ((char *) data
- + W (domain->must_swap,
- j == 0
- ? orig_sysdep_tab[i]
- : trans_sysdep_tab[i]));
+ size_t sysdep_string_offset;
+ const struct sysdep_string *sysdep_string;
+ size_t offset;
size_t need = 0;
- const struct segment_pair *p = sysdep_string->segments;
+ const struct segment_pair *p;
+ sysdep_string_offset = W (domain->must_swap,
+ j == 0
+ ? orig_sysdep_tab[i]
+ : trans_sysdep_tab[i]);
+ if (__builtin_expect (sysdep_string_offset >= size, 0))
+ {
+ freea (sysdep_segment_values);
+ goto invalid;
+ }
+ sysdep_string = (const struct sysdep_string *)
+ ((char *) data + sysdep_string_offset);
+
+ offset = W (domain->must_swap, sysdep_string->offset);
+ if (__builtin_expect (offset >= size, 0))
+ {
+ freea (sysdep_segment_values);
+ goto invalid;
+ }
+
+ p = sysdep_string->segments;
if (W (domain->must_swap, p->sysdepref) != SEGMENTS_END)
for (p = sysdep_string->segments;; p++)
{
+ nls_uint32 segsize = W (domain->must_swap,
+ p->segsize);
nls_uint32 sysdepref;
+ size_t n;
- need += W (domain->must_swap, p->segsize);
+ if (__builtin_expect (segsize
+ >= SIZE_MAX - offset, 0)
+ || __builtin_expect (offset + segsize
+ >= size, 0)
+ || __builtin_expect (segsize
+ >= SIZE_MAX - need, 0))
+ {
+ freea (sysdep_segment_values);
+ goto invalid;
+ }
+
+ need += segsize;
sysdepref = W (domain->must_swap, p->sysdepref);
if (sysdepref == SEGMENTS_END)
@@ -1057,7 +1131,13 @@ _nl_load_domain (struct loaded_l10nfile *domain_file,
break;
}
- need += strlen (sysdep_segment_values[sysdepref]);
+ n = strlen (sysdep_segment_values[sysdepref]);
+ if (__builtin_expect (n >= SIZE_MAX - need, 0))
+ {
+ freea (sysdep_segment_values);
+ goto invalid;
+ }
+ need += n;
}
needs[j] = need;
@@ -1,3 +1,8 @@
+2015-03-11 Daiki Ueno <ueno@gnu.org>
+
+ * gettext-9: New file.
+ * Makefile.am (TESTS): Add new test.
+
2015-03-05 Daiki Ueno <ueno@gnu.org>
* format-kde-kuit-1: New file.
@@ -21,7 +21,7 @@ EXTRA_DIST =
MOSTLYCLEANFILES = core *.stackdump
TESTS = gettext-1 gettext-2 gettext-3 gettext-4 gettext-5 gettext-6 gettext-7 \
- gettext-8 \
+ gettext-8 gettext-9 \
msgattrib-1 msgattrib-2 msgattrib-3 msgattrib-4 msgattrib-5 \
msgattrib-6 msgattrib-7 msgattrib-8 msgattrib-9 msgattrib-10 \
msgattrib-11 msgattrib-12 msgattrib-13 msgattrib-14 msgattrib-15 \
new file mode 100755
@@ -0,0 +1,24 @@
+#! /bin/sh
+. "${srcdir=.}/init.sh"; path_prepend_ . ../src
+
+# Test invalid or incomplete input
+
+: ${LOCALE_FR=fr_FR}
+{ test $LOCALE_FR != none && LC_ALL=$LOCALE_FR ../testlocale; } || {
+ if test -f /usr/bin/localedef; then
+ echo "Skipping test: no traditional french locale is installed"
+ else
+ echo "Skipping test: no traditional french locale is supported"
+ fi
+ exit 77
+}
+
+: ${GETTEXT=gettext}
+
+test -d fr || mkdir fr
+test -d fr/LC_MESSAGES || mkdir fr/LC_MESSAGES
+
+for n in 1 2 3 4 5 6; do
+ cp "$abs_srcdir"/overflow-$n.mo fr/LC_MESSAGES
+ echo LANGUAGE= LC_ALL=${LOCALE_FR} TEXTDOMAINDIR=. ${GETTEXT} -d overflow-$n foo
+done
--
2.1.0