[COMMITTED] Do not require memset elimination in explicit_bzero test
Commit Message
On 12/30/2016 01:16 PM, Florian Weimer wrote:
> On 12/21/2016 07:04 PM, Florian Weimer wrote:
>> On 12/20/2016 11:09 AM, Florian Weimer wrote:
>>> Some targets fail to apply dead store elimination to the
>>> memset call in setup_ordinary_clear. Before this commit,
>>> this causes the test case to fail. Instead, the test case
>>> now logs lack of memset elimination as an informational
>>> message.
>>>
>>> 2016-12-20 Florian Weimer <fweimer@redhat.com>
>>>
>>> Do not require memset elimination in explicit_bzero test.
>>> * string/tst-xbzero-opt.c (prepare_test_buffer): Force inlining.
>>> (enum test_expectation): Add NO_EXPECTATIONS.
>>> (subtests): NO_EXPECTATIONS for ordinary clear.
>>> (check_test_buffer): Handle NO_EXPECTATIONS.
>>> * string/Makefile (CFLAGS-tst-xbzero-opt.c): Compile with -O3.
>>
>> Stefan, this test still fails for me on s390x:
>>
>> PASS: no clear/prepare: expected 32 got 32
>> PASS: no clear/test: expected some got 32
>> PASS: ordinary clear/prepare: expected 32 got 32
>> INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
>> PASS: explicit clear/prepare: expected 32 got 32
>> FAIL: explicit clear/test: expected 0 got 1
>>
>> Do you have an idea what's going on there?
>
> I filed bug 21006 and will add it as a release blocker.
>
> Thanks,
> Florian
>
Hi Florian,
the test is also failing on my system and I've had a look into it.
In setup_explicit_clear, the buffer is filled with the test_pattern.
On s390x the memcpy in prepare_test_buffer is done by loading
r4 / r5 with the test_pattern and using store multiple instruction
to store r4 / r5 to buf.
If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is
stored to stack by _dl_runtime_resolve and the call to memmem in
count_test_patterns finds a hit of the test_pattern on the stack.
The attached patch resolves all symbols at program startup by linking
with -z now. This omits the call of _dl_runtime_resolve within
setup_explicit_clear and the test passes.
If this is okay, I'll commit this patch and clear this bug in the
release blockers list in the release-wiki.
Bye
Stefan
ChangeLog:
[BZ #21006]
* string/Makefile (LDFLAGS-tst-xbzero-opt): New variable.
Comments
On 01/10/2017 09:22 AM, Stefan Liebler wrote:
> On 12/30/2016 01:16 PM, Florian Weimer wrote:
>> On 12/21/2016 07:04 PM, Florian Weimer wrote:
>>> On 12/20/2016 11:09 AM, Florian Weimer wrote:
>>>> Some targets fail to apply dead store elimination to the
>>>> memset call in setup_ordinary_clear. Before this commit,
>>>> this causes the test case to fail. Instead, the test case
>>>> now logs lack of memset elimination as an informational
>>>> message.
>>>>
>>>> 2016-12-20 Florian Weimer <fweimer@redhat.com>
>>>>
>>>> Do not require memset elimination in explicit_bzero test.
>>>> * string/tst-xbzero-opt.c (prepare_test_buffer): Force inlining.
>>>> (enum test_expectation): Add NO_EXPECTATIONS.
>>>> (subtests): NO_EXPECTATIONS for ordinary clear.
>>>> (check_test_buffer): Handle NO_EXPECTATIONS.
>>>> * string/Makefile (CFLAGS-tst-xbzero-opt.c): Compile with -O3.
>>>
>>> Stefan, this test still fails for me on s390x:
>>>
>>> PASS: no clear/prepare: expected 32 got 32
>>> PASS: no clear/test: expected some got 32
>>> PASS: ordinary clear/prepare: expected 32 got 32
>>> INFO: ordinary clear/test: found 0 patterns (memset not eliminated)
>>> PASS: explicit clear/prepare: expected 32 got 32
>>> FAIL: explicit clear/test: expected 0 got 1
>>>
>>> Do you have an idea what's going on there?
>>
>> I filed bug 21006 and will add it as a release blocker.
>>
>> Thanks,
>> Florian
>>
> Hi Florian,
>
> the test is also failing on my system and I've had a look into it.
>
> In setup_explicit_clear, the buffer is filled with the test_pattern.
> On s390x the memcpy in prepare_test_buffer is done by loading
> r4 / r5 with the test_pattern and using store multiple instruction
> to store r4 / r5 to buf.
> If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is
> stored to stack by _dl_runtime_resolve and the call to memmem in
> count_test_patterns finds a hit of the test_pattern on the stack.
>
> The attached patch resolves all symbols at program startup by linking
> with -z now. This omits the call of _dl_runtime_resolve within
> setup_explicit_clear and the test passes.
>
> If this is okay, I'll commit this patch and clear this bug in the
> release blockers list in the release-wiki.
>
> Bye
> Stefan
>
> ChangeLog:
>
> [BZ #21006]
> * string/Makefile (LDFLAGS-tst-xbzero-opt): New variable.
>
PING
On Mon, Jan 16, 2017 at 3:24 AM, Stefan Liebler <stli@linux.vnet.ibm.com> wrote:
> On 01/10/2017 09:22 AM, Stefan Liebler wrote:
>>
>> In setup_explicit_clear, the buffer is filled with the test_pattern.
>> On s390x the memcpy in prepare_test_buffer is done by loading
>> r4 / r5 with the test_pattern and using store multiple instruction
>> to store r4 / r5 to buf.
>> If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is
>> stored to stack by _dl_runtime_resolve and the call to memmem in
>> count_test_patterns finds a hit of the test_pattern on the stack.
>>
>> The attached patch resolves all symbols at program startup by linking
>> with -z now. This omits the call of _dl_runtime_resolve within
>> setup_explicit_clear and the test passes.
>>
>> If this is okay, I'll commit this patch and clear this bug in the
>> release blockers list in the release-wiki.
This seems like a reasonable workaround to me. Please commit.
(Guess we better add "spill slots for callee-save registers, including
registers saved only by dynamic linker stubs" to the list of things to
worry about when adding explicit_bzero to the compiler...)
zw
On 01/16/2017 04:28 PM, Zack Weinberg wrote:
> On Mon, Jan 16, 2017 at 3:24 AM, Stefan Liebler <stli@linux.vnet.ibm.com> wrote:
>> On 01/10/2017 09:22 AM, Stefan Liebler wrote:
>>>
>>> In setup_explicit_clear, the buffer is filled with the test_pattern.
>>> On s390x the memcpy in prepare_test_buffer is done by loading
>>> r4 / r5 with the test_pattern and using store multiple instruction
>>> to store r4 / r5 to buf.
>>> If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is
>>> stored to stack by _dl_runtime_resolve and the call to memmem in
>>> count_test_patterns finds a hit of the test_pattern on the stack.
>>>
>>> The attached patch resolves all symbols at program startup by linking
>>> with -z now. This omits the call of _dl_runtime_resolve within
>>> setup_explicit_clear and the test passes.
>>>
>>> If this is okay, I'll commit this patch and clear this bug in the
>>> release blockers list in the release-wiki.
>
> This seems like a reasonable workaround to me. Please commit.
>
> (Guess we better add "spill slots for callee-save registers, including
> registers saved only by dynamic linker stubs" to the list of things to
> worry about when adding explicit_bzero to the compiler...)
>
> zw
>
Thanks.
Committed, closed bug and cleared entry in the release blockers list in
the release-wiki.
commit a07f447fd235d1898a56af7317a4ff6a11a603b9
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Tue Jan 10 08:47:13 2017 +0100
S390: Fix FAIL in test string/tst-xbzero-opt [BZ #21006]
On s390x this test failed with:
FAIL: explicit clear/test: expected 0 got 1
In setup_explicit_clear, the buffer is filled with the test_pattern.
On s390x the memcpy in prepare_test_buffer is done by loading
r4 / r5 with the test_pattern and using store multiple instruction
to store r4 / r5 to buf.
If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is
stored to stack by _dl_runtime_resolve and the call to memmem in
count_test_patterns finds a hit of the test_pattern on the stack.
This patch resolves all symbols at program startup by linking with
-z now. This omits the call of _dl_runtime_resolve within
setup_explicit_clear and the test passes.
ChangeLog:
[BZ #21006]
* string/Makefile (LDFLAGS-tst-xbzero-opt): New variable.
@@ -73,6 +73,14 @@ CFLAGS-stratcliff.c = -fno-builtin
CFLAGS-test-ffs.c = -fno-builtin
CFLAGS-tst-inlcall.c = -fno-builtin
CFLAGS-tst-xbzero-opt.c = -O3
+# BZ 21006: Resolve all functions but at least explicit_bzero at startup.
+# Otherwise the test fails on s390x as the memcpy in prepare_test_buffer is
+# done by loading r4 / r5 with the test_pattern and using store multiple
+# instruction to store r4 / r5 to buf. If explicit_bzero would be resolved in
+# setup_explicit_clear, r4 / r5 would be stored to stack by _dl_runtime_resolve
+# and the call to memmem in count_test_patterns will find a hit of the
+# test_pattern on the stack.
+LDFLAGS-tst-xbzero-opt = -z now
# Called during TLS initialization.
CFLAGS-memcpy.c = $(no-stack-protector)