Set behavior of sprintf-like functions with overlapping source and destination
Commit Message
According to ISO C99, passing the same buffer as source and destination
to sprintf, snprintf, vsprintf, or vsnprintf has undefined behavior.
Until the commit
commit 4e2f43f842ef5e253cc23383645adbaa03cedb86
Author: Zack Weinberg <zackw@panix.com>
Date: Wed Mar 7 14:32:03 2018 -0500
Use PRINTF_FORTIFY instead of _IO_FLAGS2_FORTIFY (bug 11319)
a call to sprintf or vsprintf with overlapping buffers, for instance
vsprintf (buf, "%sTEXT", buf), would append `TEXT' into buf, while a
call to snprintf or vsnprintf would override the contents of buf.
After the aforementioned commit, the behavior of sprintf and vsprintf
changed (so that they also override the contents of buf).
This patch reverts this behavioral change, because it will likely break
applications that rely on the previous behavior, even though it is
undefined by ISO C. As noted by Szabolcs Nagy, this is used in SPEC2017
507.cactuBSSN_r/src/PUGH/PughUtils.c:
sprintf(mess," Size:");
for (i=0;i<dim+1;i++)
{
sprintf(mess,"%s %d",mess,pughGH->GFExtras[dim]->nsize[i]);
}
More important to notice is the fact that the overwriting of the
destination buffer is not the only behavior affected by the refactoring.
Before the refactoring, sprintf and vsprintf would use _IO_str_jumps,
whereas __sprintf_chk and __vsprintf_chk would use _IO_str_chk_jumps.
After the refactoring, all use _IO_str_chk_jumps, which would make
sprintf and vsprintf report buffer overflows and terminate the program.
This patch also reverts this behavior, by installing the appropriate
jump table for each *sprintf functions.
Apart from reverting the changes, this patch adds a test case that has
the old behavior hardcoded, so that regressions are noticed if something
else unintentionally changes the behavior.
Tested for powerpc64le.
* debug/sprintf_chk.c (___sprintf_chk): Use PRINTF_CHK.
* debug/vsprintf_chk.c (___vsprintf_chk): Likewise.
* libio/Makefile (tests): Add tst-sprintf-ub and
tst-sprintf-chk-ub.
CFLAGS-tst-sprintf-ub.c: New variable.
CFLAGS-tst-sprintf-chk-ub.c: Likewise.
* libio/iovsprintf.c (__vsprintf_internal): Only erase the
destination buffer and check for overflows in fortified mode.
* libio/libioP.h (PRINTF_CHK): New macro.
* libio/tst-sprintf-chk-ub.c: New file.
* libio/tst-sprintf-ub.c: Likewise.
---
debug/sprintf_chk.c | 4 ++
debug/vsprintf_chk.c | 4 ++
libio/Makefile | 7 +++-
libio/iovsprintf.c | 14 ++++++-
libio/libioP.h | 6 ++-
libio/tst-sprintf-chk-ub.c | 2 +
libio/tst-sprintf-ub.c | 102 +++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 135 insertions(+), 4 deletions(-)
create mode 100644 libio/tst-sprintf-chk-ub.c
create mode 100644 libio/tst-sprintf-ub.c
Comments
Dumb question: if fortification is enabled, why can't sprintf-like
functions report an error when the source and destination overlap? The
point of fortification is to catch and report undefined behavior when
it's easy, as is the case here.
> /* Test the sprintf (buf, "%s", buf) does not override buf.
I'm leery of adding this test case, as it tests undefined behavior that
the glibc manual does not document as an extension (and it shouldn't be
documented either).
Traditionally we didn't worry about breaking code like PughUtils.c's
'sprintf(mess,"%s %d",mess,...)' under the principle that such code was
already broken. Why depart from that tradition here?
On Thu, 20 Dec 2018, Paul Eggert wrote:
>Dumb question: if fortification is enabled, why can't sprintf-like
>functions report an error when the source and destination overlap? The
>point of fortification is to catch and report undefined behavior when
>it's easy, as is the case here.
With fortification enabled, they still do. This change is more about the
non-fortified case...
Before commit ID 4e2f43f842ef, non-fortified sprintf and vsprintf did
*not*:
1. report buffer overflows and terminate the program;
2. overwrite overlapping buffers.
But since commit ID 4e2f43f842ef, they do (I haven't noticed these changes
when working on that patch, now they are being questioned here:
https://sourceware.org/ml/libc-alpha/2018-12/msg00634.html and
https://sourceware.org/ml/libc-alpha/2018-12/msg00839.html).
This new patch is about reverting these two items for the non-fortified
case (maybe they should be considered separately).
>> /* Test the sprintf (buf, "%s", buf) does not override buf.
>
>I'm leery of adding this test case, as it tests undefined behavior that
>the glibc manual does not document as an extension (and it shouldn't be
>documented either).
>
>Traditionally we didn't worry about breaking code like PughUtils.c's
>'sprintf(mess,"%s %d",mess,...)' under the principle that such code was
>already broken. Why depart from that tradition here?
I don't know how to answer that... I'm sort of a rookie, and I wasn't
here to witness past, similar changes and what was the fallout. Maybe
other people have better, backed opinions about it...
On Thu, 20 Dec 2018, Gabriel F. T. Gomes wrote:
> >Traditionally we didn't worry about breaking code like PughUtils.c's
> >'sprintf(mess,"%s %d",mess,...)' under the principle that such code was
> >already broken. Why depart from that tradition here?
>
> I don't know how to answer that... I'm sort of a rookie, and I wasn't
> here to witness past, similar changes and what was the fallout. Maybe
> other people have better, backed opinions about it...
We have the example of x86_64 memcpy (where a GLIBC_2.14 version was added
to avoid breaking old *binaries* expecting it to have memmove semantics,
but in that case breaking *sources* expecting overlapping copies to work
was considered fine).
* Joseph Myers:
> On Thu, 20 Dec 2018, Gabriel F. T. Gomes wrote:
>
>> >Traditionally we didn't worry about breaking code like PughUtils.c's
>> >'sprintf(mess,"%s %d",mess,...)' under the principle that such code was
>> >already broken. Why depart from that tradition here?
>>
>> I don't know how to answer that... I'm sort of a rookie, and I wasn't
>> here to witness past, similar changes and what was the fallout. Maybe
>> other people have better, backed opinions about it...
>
> We have the example of x86_64 memcpy (where a GLIBC_2.14 version was added
> to avoid breaking old *binaries* expecting it to have memmove semantics,
> but in that case breaking *sources* expecting overlapping copies to work
> was considered fine).
The fallout from that was largely negative, though, because
memcpy@GLIBC_2.14 was the only symbol that required the GLIBC_2.14
symbol version for quite some time.
These days, I'd expect us to provide an LD_PRELOAD library instead of a
symbol version (like libBrokenLocale), or wait until something else that
requires a new symbol version for many binaries comes around.
Thanks,
Florian
On 21/12/18 3:47 AM, Paul Eggert wrote:
> Dumb question: if fortification is enabled, why can't sprintf-like
> functions report an error when the source and destination overlap? The
> point of fortification is to catch and report undefined behavior when
> it's easy, as is the case here.
>
>> /* Test the sprintf (buf, "%s", buf) does not override buf.
>
> I'm leery of adding this test case, as it tests undefined behavior that
> the glibc manual does not document as an extension (and it shouldn't be
> documented either).
>
> Traditionally we didn't worry about breaking code like PughUtils.c's
> 'sprintf(mess,"%s %d",mess,...)' under the principle that such code was
> already broken. Why depart from that tradition here?
Is the disagreement here only about testing UB or also about retaining
old behaviour in case of UB? If it's just the former then we could make
forward progress by just removing the UB test case and just keeping the
ub-chk test case.
It may not be too hard for the compiler to see this undefined behaviour
and warn about it either, at least in some trivial cases...
Siddhesh
Siddhesh Poyarekar wrote:
> On 21/12/18 3:47 AM, Paul Eggert wrote:
>>> /* Test the sprintf (buf, "%s", buf) does not override buf.
>>
>> I'm leery of adding this test case, as it tests undefined behavior that the
>> glibc manual does not document as an extension (and it shouldn't be documented
>> either).
>>
>> Traditionally we didn't worry about breaking code like PughUtils.c's
>> 'sprintf(mess,"%s %d",mess,...)' under the principle that such code was
>> already broken. Why depart from that tradition here?
>
> Is the disagreement here only about testing UB or also about retaining old
> behaviour in case of UB?
Primarily the former. I don't want us to test for and/or guarantee support for
this particular implementation of UB. (Although I'm not happy about the extra
code inserted into every printf call to deal with this situation, it's not like
printf is particularly fast now....)
> If it's just the former then we could make forward
> progress by just removing the UB test case and just keeping the ub-chk test case.
>
> It may not be too hard for the compiler to see this undefined behaviour and warn
> about it either, at least in some trivial cases...
Yes, that'd be good.
Paul Eggert wrote:
> Siddhesh Poyarekar wrote:
>> On 21/12/18 3:47 AM, Paul Eggert wrote:
>>> Traditionally we didn't worry about breaking code like PughUtils.c's
>>> 'sprintf(mess,"%s %d",mess,...)' under the principle that such code
>>> was already broken. Why depart from that tradition here?
>>
>> Is the disagreement here only about testing UB or also about retaining
>> old behaviour in case of UB?
>
> Primarily the former. I don't want us to test for and/or guarantee support
> for this particular implementation of UB. (Although I'm not happy about the
> extra code inserted into every printf call to deal with this situation, it's
> not like printf is particularly fast now....)
Can we keep the test and just include a (source) comment describing the
lack of support behind it?
It feels odd to bend over backwards to support a behavior on one hand,
while not having a test for it on the other.
>> If it's just the former then we could make forward progress by just
>> removing the UB test case and just keeping the ub-chk test case.
>>
>> It may not be too hard for the compiler to see this undefined behaviour
>> and warn about it either, at least in some trivial cases...
>
> Yes, that'd be good.
Learning from the GLIBC_2.14 memcpy, I wonder what our best option is.
I wonder if something like the following would be too complex:
1. Introduce GLIBC_2.29 printf that does *not* support this undefined
behavior. Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
still default to the GLIBC_2.0 version even when building new
programs.
2. After a critical mass of users are using glibc 2.29 or newer (e.g.
after a year), switch the default version to GLIBC_2.29.
What do you think?
Thanks,
Jonathan
Jonathan Nieder wrote:
> Can we keep the test and just include a (source) comment describing the
> lack of support behind it?
How about if we comment out the test, or enable it only if some weird macro is
already defined?
> It feels odd to bend over backwards to support a behavior on one hand,
> while not having a test for it on the other.
It is a special case, indeed.
> I wonder if something like the following would be too complex:
>
> 1. Introduce GLIBC_2.29 printf that does *not* support this undefined
> behavior. Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
> still default to the GLIBC_2.0 version even when building new
> programs.
>
> 2. After a critical mass of users are using glibc 2.29 or newer (e.g.
> after a year), switch the default version to GLIBC_2.29.
>
> What do you think?
I'm afraid it sounds confusing, as GLIBC_2.29 wouldn't be the default for glibc
2.29 when it's released. Perhaps the symbol should be GLIBC_UNSTABLE instead?
But even then, I don't see why users would use the new version before the
default was switched, so if there are issues we won't find them any more gently
with this approach.
On 27/12/18 12:26 AM, Paul Eggert wrote:
>> It may not be too hard for the compiler to see this undefined
>> behaviour and warn about it either, at least in some trivial cases...
>
> Yes, that'd be good.
Turns out the compiler does identify aliases in this case (although in
a very generic way because of which the warning may not be super-clear
to a lot of devs at first glance):
foo.c:9:11: warning: passing argument 1 to restrict-qualified parameter
aliases with argument 3 [-Wrestrict]
sprintf(buf, "%sCD", buf);
^~~ ~~~
So I suppose we have this part covered.
Siddhesh
On 27/12/18 3:52 AM, Paul Eggert wrote:
>> I wonder if something like the following would be too complex:
>>
>> 1. Introduce GLIBC_2.29 printf that does *not* support this undefined
>> behavior. Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
>> still default to the GLIBC_2.0 version even when building new
>> programs.
>>
>> 2. After a critical mass of users are using glibc 2.29 or newer (e.g.
>> after a year), switch the default version to GLIBC_2.29.
>>
>> What do you think?
>
> I'm afraid it sounds confusing, as GLIBC_2.29 wouldn't be the default
> for glibc 2.29 when it's released. Perhaps the symbol should be
> GLIBC_UNSTABLE instead? But even then, I don't see why users would use
> the new version before the default was switched, so if there are issues
> we won't find them any more gently with this approach.
We should not retain compatible behaviour for building, only for linking
like we normally do. The symbol versioning patch won't get backported
(since it is an ABI event) to older distributions and as long as it is
alongside gcc8+ (I tested 8, the warning might have been introduced
earlier) there is an adequate warning that tells users their houses may
burn down if they depend on this behaviour. Or something like that.
The risk is not completely gone though and there may be environments
(gcc5 for example) where there is no warning. These would be custom
built environments though and I suppose we can depend on these devs to
read the manpage more carefully. Or maybe add more code in the headers
to link against the compat *printf if an older compiler is detected?
It's debatable if all of this is necessary for undefined behaviour...
In any case, this doesn't look like something that'll be done within
this week, so I propose we add this compatibility change in now (without
the UB test) in the interest of not breaking things and then add
versioning to only retain backward compatible behaviour for GLIBC_2.0
printf after discussing at length how much breakage we can stand. Maybe
also get friends at LWN to write about how sprintf using aliased buffers
is asking for trouble.
With this action plan we don't really need to test the undefined
behaviour since we don't intend to retain it except in the compat case,
just flip the switch in the ub-chk test later so that it always does the
check instead of only under _FORTIFY_SOURCE.
Siddhesh
On Wed, Dec 26, 2018 at 8:19 PM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> On 27/12/18 3:52 AM, Paul Eggert wrote:
> >> I wonder if something like the following would be too complex:
> >>
> >> 1. Introduce GLIBC_2.29 printf that does *not* support this undefined
> >> behavior. Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
> >> still default to the GLIBC_2.0 version even when building new
> >> programs.
> >>
> >> 2. After a critical mass of users are using glibc 2.29 or newer (e.g.
> >> after a year), switch the default version to GLIBC_2.29.
> >>
> >> What do you think?
> >
> > I'm afraid it sounds confusing, as GLIBC_2.29 wouldn't be the default
> > for glibc 2.29 when it's released. Perhaps the symbol should be
> > GLIBC_UNSTABLE instead? But even then, I don't see why users would use
> > the new version before the default was switched, so if there are issues
> > we won't find them any more gently with this approach.
>
> We should not retain compatible behaviour for building, only for linking
> like we normally do. The symbol versioning patch won't get backported
> (since it is an ABI event) to older distributions and as long as it is
> alongside gcc8+ (I tested 8, the warning might have been introduced
> earlier) there is an adequate warning that tells users their houses may
> burn down if they depend on this behaviour. Or something like that.
Despite having written the patch that broke the old behavior, I think
this is much too aggressive. The fact that we almost immediately
discovered breakage after the patch landed means there are probably a
whole lot of programs out there relying on it, and I don't think it's
safe to assume people will pay attention to warnings _or_ read
documentation. Witness how people are _still_ complaining about the
memcpy change.
I'm inclined to say that this degree of freedom is now frozen and we
need to accept that the old behavior has become a supported GNU
extension and we should document it as such, test for it, etc. Not a
good extension, but one we are stuck with. Failing that, I think we
need to preserve the old behavior for at least one more full release
and we need to announce as loudly and widely as possible that we are
changing it. If we do change it, we should also make sure that the
new behavior is well-defined and tested for all cases of overlapping
buffers, and what the new behavior is must be documented, and we need
to stick to it.
zw
On 27/12/18 7:45 AM, Zack Weinberg wrote:
> Despite having written the patch that broke the old behavior, I think
> this is much too aggressive. The fact that we almost immediately
> discovered breakage after the patch landed means there are probably a
> whole lot of programs out there relying on it, and I don't think it's
> safe to assume people will pay attention to warnings _or_ read
> documentation. Witness how people are _still_ complaining about the
> memcpy change.
>
> I'm inclined to say that this degree of freedom is now frozen and we
> need to accept that the old behavior has become a supported GNU
> extension and we should document it as such, test for it, etc. Not a
> good extension, but one we are stuck with. Failing that, I think we
> need to preserve the old behavior for at least one more full release
> and we need to announce as loudly and widely as possible that we are
> changing it. If we do change it, we should also make sure that the
> new behavior is well-defined and tested for all cases of overlapping
> buffers, and what the new behavior is must be documented, and we need
> to stick to it.
Thanks, to clarify, is your position that we revert to old behaviour
(for now) for the default case only or for everything, including
_FORTIFY_SOURCE?
Siddhesh
On Thu, Dec 27, 2018 at 11:15:06AM +0530, Siddhesh Poyarekar wrote:
> Thanks, to clarify, is your position that we revert to old behaviour (for
> now) for the default case only or for everything, including _FORTIFY_SOURCE?
The _FORTIFY_SOURCE version did not change behaviour... Even before
Zack's patches, it would overwrite overlapping buffers. This means that
this patch, as it is today, reverts all behavior to what it used to be.
* Jonathan Nieder:
> Learning from the GLIBC_2.14 memcpy,
I assume you've taken my observations in
<https://sourceware.org/ml/libc-alpha/2018-12/msg00871.html>
into account.
> I wonder what our best option is. I wonder if something like the
> following would be too complex:
>
> 1. Introduce GLIBC_2.29 printf that does *not* support this undefined
> behavior. Use __asm__(".symver [...]@@GLIBC_2.0") to ensure we
> still default to the GLIBC_2.0 version even when building new
> programs.
>
> 2. After a critical mass of users are using glibc 2.29 or newer (e.g.
> after a year), switch the default version to GLIBC_2.29.
>
> What do you think?
So you want to introduce a compat symbol for GLIBC_2.29, but leave the
default symbol version at whatever version we have today?
Waiting a year (until a 2020 release—how time flies!) will not make a
substantial difference in deployment due to the release schedules of
distributions currently under development and their glibc version
choices. That part is a non-starter, I'm afraid. A multi-year delay
would make a difference, of course, but I'm not sure if that's
appropriate. I think the most feasible course of action would be to
bundle several such changes together, so that the impact is felt only
once.
Historically, we have used symbol versioning (in the sense that we
changed the default symbol version) for two different situations:
(1) A type changed size or a function changed its prototype in a
binary-incompatible way, usually prompted by a desire to increase
standards conformance.
(2) A function changed behavior and existing software broke.
memcpy@GLIBC_2.14 is clearly in the second category. A more recent
example is glob@GLIBC_2.27.
The key difference between (1) and (2) is that rebuilding existing
software for (1) either works as before, or you get at a compiler
error and have to apply a simple fix. For (2), on the other hand, the
bug is still there, and recompilation reintroduces it.
Over the years, I have come to the realization that (2) really only
benefits proprietary software, and unmaintained proprietary software
at that. We still receive bug reports about the glob@GLIBC_2.27
transition because people try to build the wrong version of GNU make
on glibc 2.27 or later. I would say that using symbol versioning to
make such backwards-incompatible changes is very confusing and may not
be worth it. Instead, we should make the change without introducing a
symbol version (to treat everyone the same), or not make the change at
all.
The sprintf change clearly is in category (2). However, there's a
mitigation circumstance: Distributions already build with
_FORTIFY_SOURCE (at least they try to), so they are largely exposed to
the new behavior. So I'd expect that in this particular case, the
recompilation problem would largely affect unpackaged, in-house
software. Maybe this makes the change more acceptable and could
actually justify introducing a symbol version in this case?
On Thu, Dec 27, 2018 at 8:46 AM Gabriel F. T. Gomes
<gabriel@inconstante.eti.br> wrote:
> On Thu, Dec 27, 2018 at 11:15:06AM +0530, Siddhesh Poyarekar wrote:
> > Thanks, to clarify, is your position that we revert to old behaviour (for
> > now) for the default case only or for everything, including _FORTIFY_SOURCE?
>
> The _FORTIFY_SOURCE version did not change behaviour... Even before
> Zack's patches, it would overwrite overlapping buffers. This means that
> this patch, as it is today, reverts all behavior to what it used to be.
Yes. My position right now is that we should merge Gabriel's patch as
is -- including the test case! -- for 2.29, and consider whether we
want to make any further changes after the release.
zw
On Thu, Dec 27, 2018 at 10:27 AM Florian Weimer <fw@deneb.enyo.de> wrote:
> Historically, we have used symbol versioning (in the sense that we
> changed the default symbol version) for two different situations:
>
> (1) A type changed size or a function changed its prototype in a
> binary-incompatible way, usually prompted by a desire to increase
> standards conformance.
>
> (2) A function changed behavior and existing software broke.
>
> memcpy@GLIBC_2.14 is clearly in the second category. A more recent
> example is glob@GLIBC_2.27.
>
> The key difference between (1) and (2) is that rebuilding existing
> software for (1) either works as before, or you get at a compiler
> error and have to apply a simple fix. For (2), on the other hand, the
> bug is still there, and recompilation reintroduces it.
>
> Over the years, I have come to the realization that (2) really only
> benefits proprietary software, and unmaintained proprietary software
> at that. We still receive bug reports about the glob@GLIBC_2.27
> transition because people try to build the wrong version of GNU make
> on glibc 2.27 or later. I would say that using symbol versioning to
> make such backwards-incompatible changes is very confusing and may not
> be worth it. Instead, we should make the change without introducing a
> symbol version (to treat everyone the same), or not make the change at
> all.
I agree with this and I also think we have historically been too
aggressive about breaking programs that depend on behavior that is
formally undefined in the C standard but has been well-defined for
many years in the GNU ecosystem. In this case I am particularly
concerned about breaking programs that are distributed as source but
not carefully maintained. It's really easy to miss one more warning
in a program that already has lots.
> The sprintf change clearly is in category (2). However, there's a
> mitigation circumstance: Distributions already build with
> _FORTIFY_SOURCE (at least they try to), so they are largely exposed to
> the new behavior. So I'd expect that in this particular case, the
> recompilation problem would largely affect unpackaged, in-house
> software. Maybe this makes the change more acceptable and could
> actually justify introducing a symbol version in this case?
It seems to me it's the other way around: distribution-packaged
software is more likely to receive fixes for this type of problem than
unpackaged in-house stuff is.
zw
* Zack Weinberg:
>> The sprintf change clearly is in category (2). However, there's a
>> mitigation circumstance: Distributions already build with
>> _FORTIFY_SOURCE (at least they try to), so they are largely exposed to
>> the new behavior. So I'd expect that in this particular case, the
>> recompilation problem would largely affect unpackaged, in-house
>> software. Maybe this makes the change more acceptable and could
>> actually justify introducing a symbol version in this case?
>
> It seems to me it's the other way around: distribution-packaged
> software is more likely to receive fixes for this type of problem than
> unpackaged in-house stuff is.
On common code paths, yes, but the same applies to in-house code.
What I tried to say is that the distribution fixes have already
happened due to the impact of -D_FORTIFY_SOURCE=2 by default and
hopefully have been upstreamed by now. (This may have given us the
delay that Jonathan was suggesting.) As a result, we wouldn't punish
free software over proprietary software in this case. Sorry if I
didn't explain my line of reasoning clearly enough.
On 27/12/18 10:29 PM, Zack Weinberg wrote:
> On Thu, Dec 27, 2018 at 8:46 AM Gabriel F. T. Gomes
> <gabriel@inconstante.eti.br> wrote:
>> On Thu, Dec 27, 2018 at 11:15:06AM +0530, Siddhesh Poyarekar wrote:
>>> Thanks, to clarify, is your position that we revert to old behaviour (for
>>> now) for the default case only or for everything, including _FORTIFY_SOURCE?
>>
>> The _FORTIFY_SOURCE version did not change behaviour... Even before
>> Zack's patches, it would overwrite overlapping buffers. This means that
>> this patch, as it is today, reverts all behavior to what it used to be.
>
> Yes. My position right now is that we should merge Gabriel's patch as
> is -- including the test case! -- for 2.29, and consider whether we
> want to make any further changes after the release.
OK thaks for the clarification. Paul, do you have a strong opposition
to this or is it OK if we take this course for this release?
Siddhesh
On Thu, 27 Dec 2018, Siddhesh Poyarekar wrote:
> With this action plan we don't really need to test the undefined behaviour
> since we don't intend to retain it except in the compat case, just flip the
Compat symbol semantics should be tested in the testsuite (with
appropriate TEST_COMPAT conditionals and use of compat_symbol_reference).
(There are exceptional cases where the new symbol's semantics are a
refinement of the old and so the two symbol versions alias each other and
no separate test of the old version is useful, but a new version is added
to stop new binaries working with old libc - e.g. when some fenv.h
functions changed from void to int return type following C99TC1 - but the
normal case is that the compat symbols should be tested to ensure they
keep working as expected.)
On 31/12/18 11:16 PM, Joseph Myers wrote:
> Compat symbol semantics should be tested in the testsuite (with
> appropriate TEST_COMPAT conditionals and use of compat_symbol_reference).
> (There are exceptional cases where the new symbol's semantics are a
> refinement of the old and so the two symbol versions alias each other and
> no separate test of the old version is useful, but a new version is added
> to stop new binaries working with old libc - e.g. when some fenv.h
> functions changed from void to int return type following C99TC1 - but the
> normal case is that the compat symbols should be tested to ensure they
> keep working as expected.)
That's a good point, and something to keep in mind when we revive this
discussion for 2.30. For 2.29, do you agree that we should revert to
old behaviour?
Siddhesh
On Thu, 27 Dec 2018, Florian Weimer wrote:
> The key difference between (1) and (2) is that rebuilding existing
> software for (1) either works as before, or you get at a compiler
> error and have to apply a simple fix. For (2), on the other hand, the
> bug is still there, and recompilation reintroduces it.
Recompilation reintroduces it *if the software doesn't have tests that
expose the issue*. If the software has sufficient tests, (2) reduces to
(1) - recompilation produces a (test) error. (And software is more likely
to have tests run when it is recompiled than when glibc is updated without
the software being recompiled - there are of course the differences here
between GNU/Linux distributions as to whether packaged software without
any changes gets rebuilt in bulk for each distribution release.)
On Mon, 31 Dec 2018, Siddhesh Poyarekar wrote:
> That's a good point, and something to keep in mind when we revive this
> discussion for 2.30. For 2.29, do you agree that we should revert to old
> behaviour?
Yes (and make sure it's tested in the testsuite).
@@ -29,6 +29,10 @@ ___sprintf_chk (char *s, int flag, size_t slen, const char *format, ...)
va_list ap;
int ret;
+ /* Regardless of the value of flag, let __vsprintf_internal know that
+ this is a call from *printf_chk. */
+ mode |= PRINTF_CHK;
+
if (slen == 0)
__chk_fail ();
@@ -25,6 +25,10 @@ ___vsprintf_chk (char *s, int flag, size_t slen, const char *format,
can only come from read-only format strings. */
unsigned int mode = (flag > 0) ? PRINTF_FORTIFY : 0;
+ /* Regardless of the value of flag, let __vsprintf_internal know that
+ this is a call from *printf_chk. */
+ mode |= PRINTF_CHK;
+
if (slen == 0)
__chk_fail ();
@@ -64,7 +64,8 @@ tests = tst_swprintf tst_wprintf tst_swscanf tst_wscanf tst_getwc tst_putwc \
bug-memstream1 bug-wmemstream1 \
tst-setvbuf1 tst-popen1 tst-fgetwc bug-wsetpos tst-fseek \
tst-fwrite-error tst-ftell-partial-wide tst-ftell-active-handler \
- tst-ftell-append tst-fputws tst-bz22415 tst-fgetc-after-eof
+ tst-ftell-append tst-fputws tst-bz22415 tst-fgetc-after-eof \
+ tst-sprintf-ub tst-sprintf-chk-ub
tests-internal = tst-vtables tst-vtables-interposed tst-readline
@@ -152,6 +153,10 @@ CFLAGS-oldtmpfile.c += -fexceptions
CFLAGS-tst_putwc.c += -DOBJPFX=\"$(objpfx)\"
+# These test cases intentionally use overlapping arguments
+CFLAGS-tst-sprintf-ub.c += -Wno-restrict
+CFLAGS-tst-sprintf-chk-ub.c += -Wno-restrict
+
tst_wprintf2-ARGS = "Some Text"
test-fmemopen-ENV = MALLOC_TRACE=$(objpfx)test-fmemopen.mtrace
@@ -77,8 +77,18 @@ __vsprintf_internal (char *string, size_t maxlen,
sf._sbf._f._lock = NULL;
#endif
_IO_no_init (&sf._sbf._f, _IO_USER_LOCK, -1, NULL, NULL);
- _IO_JUMPS (&sf._sbf) = &_IO_str_chk_jumps;
- string[0] = '\0';
+ /* When called from fortified sprintf/vsprintf, erase the destination
+ buffer and try to detect overflows. When called from regular
+ sprintf/vsprintf, do not erase the destination buffer, because
+ known user code relies on this behavior (even though its undefined
+ by ISO C), nor try to detect overflows. */
+ if ((mode_flags & PRINTF_CHK) != 0)
+ {
+ _IO_JUMPS (&sf._sbf) = &_IO_str_chk_jumps;
+ string[0] = '\0';
+ }
+ else
+ _IO_JUMPS (&sf._sbf) = &_IO_str_jumps;
_IO_str_init_static_internal (&sf, string,
(maxlen == -1) ? -1 : maxlen - 1,
string);
@@ -705,9 +705,13 @@ extern int __vswprintf_internal (wchar_t *string, size_t maxlen,
PRINTF_FORTIFY, when set to one, indicates that fortification checks
are to be performed in input parameters. This is used by the
__*printf_chk functions, which are used when _FORTIFY_SOURCE is
- defined to 1 or 2. Otherwise, such checks are ignored. */
+ defined to 1 or 2. Otherwise, such checks are ignored.
+
+ PRINTF_CHK indicates, to the internal function being called, that the
+ call is originated from one of the __*printf_chk functions. */
#define PRINTF_LDBL_IS_DBL 0x0001
#define PRINTF_FORTIFY 0x0002
+#define PRINTF_CHK 0x0004
extern size_t _IO_getline (FILE *,char *, size_t, int, int);
libc_hidden_proto (_IO_getline)
new file mode 100644
@@ -0,0 +1,2 @@
+#define _FORTIFY_SOURCE 2
+#include <tst-sprintf-ub.c>
new file mode 100644
@@ -0,0 +1,102 @@
+/* Test the sprintf (buf, "%s", buf) does not override buf.
+ Copyright (C) 2018 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <support/check.h>
+
+enum
+{
+ FUNCTION_FIRST,
+ FUNCTION_SPRINTF = FUNCTION_FIRST,
+ FUNCTION_SNPRINF,
+ FUNCTION_VSPRINTF,
+ FUNCTION_VSNPRINTF,
+ FUNCTION_LAST
+};
+
+static void
+do_one_test (int function, char *buf, ...)
+{
+ va_list args;
+ char *arg;
+
+ /* Expected results for fortified and non-fortified sprintf. */
+#if defined _FORTIFY_SOURCE && _FORTIFY_SOURCE > 1
+ const char *expected = "CD";
+#else
+ const char *expected = "ABCD";
+#endif
+
+ va_start (args, buf);
+ arg = va_arg (args, char *);
+ va_end (args);
+
+ switch (function)
+ {
+ /* The regular sprintf and vsprintf functions do not override the
+ destination buffer, even if source and destination overlap. */
+ case FUNCTION_SPRINTF:
+ sprintf (buf, "%sCD", buf);
+ TEST_COMPARE_STRING (buf, expected);
+ break;
+ case FUNCTION_VSPRINTF:
+ va_start (args, buf);
+ vsprintf (arg, "%sCD", args);
+ TEST_COMPARE_STRING (arg, expected);
+ va_end (args);
+ break;
+ /* On the other hand, snprint and vsnprint override overlapping
+ source and destination buffers. */
+ case FUNCTION_SNPRINF:
+ snprintf (buf, 3, "%sCD", buf);
+ TEST_COMPARE_STRING (buf, "CD");
+ break;
+ case FUNCTION_VSNPRINTF:
+ va_start (args, buf);
+ vsnprintf (arg, 3, "%sCD", args);
+ TEST_COMPARE_STRING (arg, "CD");
+ va_end (args);
+ break;
+ default:
+ support_record_failure ();
+ }
+}
+
+static int
+do_test (void)
+{
+ char buf[8];
+ int i;
+
+ /* For each function in the enum do:
+ - reset the buffer to the initial state "AB";
+ - call the function with buffer as source and destination;
+ - check for the desired behavior. */
+ for (i = FUNCTION_FIRST; i < FUNCTION_LAST; i++)
+ {
+ strncpy (buf, "AB", 3);
+ do_one_test (i, buf, buf);
+ }
+
+ return 0;
+}
+
+#include <support/test-driver.c>