Message ID | 5B8FD8B302000078001E5940@prv1-mh.provo.novell.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 27049 invoked by alias); 5 Sep 2018 13:23:04 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 27040 invoked by uid 89); 5 Sep 2018 13:23:04 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-11.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.2 spammy=whatsoever X-HELO: prv1-mh.provo.novell.com Received: from prv1-mh.provo.novell.com (HELO prv1-mh.provo.novell.com) (137.65.248.33) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 05 Sep 2018 13:23:02 +0000 Received: from INET-PRV1-MTA by prv1-mh.provo.novell.com with Novell_GroupWise; Wed, 05 Sep 2018 07:23:01 -0600 Message-Id: <5B8FD8B302000078001E5940@prv1-mh.provo.novell.com> Date: Wed, 05 Sep 2018 07:22:59 -0600 From: "Jan Beulich" <JBeulich@suse.com> To: "GDB" <gdb-patches@sourceware.org> Subject: [PATCH] x86-64: fix ZMM register state tracking Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Content-Disposition: inline |
Commit Message
Jan Beulich
Sept. 5, 2018, 1:22 p.m. UTC
The three AVX512 state components are entirely independent - one being in its "init state" has no implication whatsoever on either of the other two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to prevent upper halves of the upper 16 ZMM registers to display as if they were zero (when they aren't) after e.g. VZEROALL/VZEROUPPER. gdb/ 2018-09-05 Jan Beulich <jbeulich@suse.com> * i387-tdep.c (i387_supply_xsave): Split handling of X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. (i387_collect_xsave): Likewise.
Comments
Hi Jan, Would it be possible to update or create a test to exercise that? arch-specific tests are in testsuite/gdb.arch. I have some comments about the form. Tim, could you or somebody else from Intel take a look at the functional side of the changes? On 2018-09-05 02:22 PM, Jan Beulich wrote: > The three AVX512 state components are entirely independent - one being > in its "init state" has no implication whatsoever on either of the other > two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to > prevent upper halves of the upper 16 ZMM registers to display as if they > were zero (when they aren't) after e.g. VZEROALL/VZEROUPPER. > > gdb/ > 2018-09-05 Jan Beulich <jbeulich@suse.com> > > * i387-tdep.c (i387_supply_xsave): Split handling of > X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. > (i387_collect_xsave): Likewise. > > --- a/gdb/i387-tdep.c > +++ b/gdb/i387-tdep.c > @@ -923,7 +923,8 @@ i387_supply_xsave (struct regcache *regc > enum bfd_endian byte_order = gdbarch_byte_order (gdbarch); > struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); > const gdb_byte *regs = (const gdb_byte *) xsave; > - int i; > + int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) > + + std::min (tdep->num_zmm_regs, 16); The GNU standard requires to parenthesis when breaking lines: int i, zmm_endlo_regnum = (I387_ZMM0H_REGNUM (tdep) + std::min (tdep->num_zmm_regs, 16)); Simon
>>> Simon Marchi <simon.marchi@ericsson.com> 09/08/18 1:13 AM >>> >Would it be possible to update or create a test to exercise that? >arch-specific tests are in testsuite/gdb.arch. I'm sure it would be possible, but while I was happy to invest the time to fix the actual bug (because it affects work I'm doing), I'm afraid I don't have the time to learn how gdb test cases are to be constructed (I'm familiar only with the binutils / gas side of things). >On 2018-09-05 02:22 PM, Jan Beulich wrote: >> --- a/gdb/i387-tdep.c >> +++ b/gdb/i387-tdep.c >> @@ -923,7 +923,8 @@ i387_supply_xsave (struct regcache *regc >> enum bfd_endian byte_order = gdbarch_byte_order (gdbarch); >> struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); >> const gdb_byte *regs = (const gdb_byte *) xsave; >> - int i; >> + int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) >> + + std::min (tdep->num_zmm_regs, 16); > >The GNU standard requires to parenthesis when breaking lines: > >int i, zmm_endlo_regnum = (I387_ZMM0H_REGNUM (tdep) >+ std::min (tdep->num_zmm_regs, 16)); This is easy enough to fix (perhaps even without the need to send a v2, but just before/while committing?); I don't think this is a rule being enforced on the binutils side, so I simply wasn't aware. Jan
On 2018-09-10 07:25, Jan Beulich wrote: >>>> Simon Marchi <simon.marchi@ericsson.com> 09/08/18 1:13 AM >>> >> Would it be possible to update or create a test to exercise that? >> arch-specific tests are in testsuite/gdb.arch. > > I'm sure it would be possible, but while I was happy to invest the time > to > fix the actual bug (because it affects work I'm doing), I'm afraid I > don't have > the time to learn how gdb test cases are to be constructed (I'm > familiar > only with the binutils / gas side of things). I understand. If you can provide: - a minimal source file (C + assembly in this case, I suppose) - GDB commands to reproduce the bug - actual and expected output I can take care of turning it in a GDB test case. >> On 2018-09-05 02:22 PM, Jan Beulich wrote: >>> --- a/gdb/i387-tdep.c >>> +++ b/gdb/i387-tdep.c >>> @@ -923,7 +923,8 @@ i387_supply_xsave (struct regcache *regc >>> enum bfd_endian byte_order = gdbarch_byte_order (gdbarch); >>> struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); >>> const gdb_byte *regs = (const gdb_byte *) xsave; >>> - int i; >>> + int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) >>> + + std::min (tdep->num_zmm_regs, 16); >> >> The GNU standard requires to parenthesis when breaking lines: >> >> int i, zmm_endlo_regnum = (I387_ZMM0H_REGNUM (tdep) >> + std::min (tdep->num_zmm_regs, 16)); > > This is easy enough to fix (perhaps even without the need to send a v2, > but just before/while committing?); I don't think this is a rule being > enforced on the binutils side, so I simply wasn't aware. Yeah, no problem, I can do it. It's suggested in the GNU coding standards, so I think it would apply to binutils too: https://www.gnu.org/prep/standards/html_node/Formatting.html Search for "Adding a set of parentheses". Simon
On 09/05/2018 02:22 PM, Jan Beulich wrote: > The three AVX512 state components are entirely independent - one being > in its "init state" has no implication whatsoever on either of the other > two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to > prevent upper halves of the upper 16 ZMM registers to display as if they > were zero (when they aren't) after e.g. VZEROALL/VZEROUPPER. > > gdb/ > 2018-09-05 Jan Beulich <jbeulich@suse.com> > > * i387-tdep.c (i387_supply_xsave): Split handling of > X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. > (i387_collect_xsave): Likewise. Does gdb/gdbserver/i387-fp.c need similar treatment? Thanks, Pedro Alves
>>> On 11.09.18 at 12:34, <palves@redhat.com> wrote: > On 09/05/2018 02:22 PM, Jan Beulich wrote: >> The three AVX512 state components are entirely independent - one being >> in its "init state" has no implication whatsoever on either of the other >> two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to >> prevent upper halves of the upper 16 ZMM registers to display as if they >> were zero (when they aren't) after e.g. VZEROALL/VZEROUPPER. >> >> gdb/ >> 2018-09-05 Jan Beulich <jbeulich@suse.com> >> >> * i387-tdep.c (i387_supply_xsave): Split handling of >> X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. >> (i387_collect_xsave): Likewise. > > Does gdb/gdbserver/i387-fp.c need similar treatment? Not afaics - there's no place where both flags would be tested at the same time (other than was the case here before this patch). Jan
>>> On 10.09.18 at 15:01, <simon.marchi@polymtl.ca> wrote: > On 2018-09-10 07:25, Jan Beulich wrote: >>>>> Simon Marchi <simon.marchi@ericsson.com> 09/08/18 1:13 AM >>> >>> Would it be possible to update or create a test to exercise that? >>> arch-specific tests are in testsuite/gdb.arch. >> >> I'm sure it would be possible, but while I was happy to invest the time >> to >> fix the actual bug (because it affects work I'm doing), I'm afraid I >> don't have >> the time to learn how gdb test cases are to be constructed (I'm >> familiar >> only with the binutils / gas side of things). > > I understand. If you can provide: > > - a minimal source file (C + assembly in this case, I suppose) > - GDB commands to reproduce the bug > - actual and expected output Attached. vzero.s is the source file used (no C file needed). gdb.log is a transcript of a session with a broken gdb (the one installed on the system), while gdb.txt is a transcript for the fixed one that I've built myself. > I can take care of turning it in a GDB test case. Thanks. Jan [jbeulich@dus-dev-sles15v ~]$ gcc -Wall -W -o vzero{,.s} && gdbx ./vzero GNU gdb (GDB) 8.1.1 Copyright (C) 2018 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from ./vzero...done. (gdb) break main Breakpoint 1 at 0x400497 (gdb) r Starting program: /home/jbeulich/vzero Breakpoint 1, 0x0000000000400497 in main () (gdb) disp/i $pc 2: x/i $pc => 0x400497 <main>: vpternlogd zmm0,zmm0,zmm0,0xff (gdb) si 0x000000000040049e in main () 2: x/i $pc => 0x40049e <main+7>: vpternlogd zmm16,zmm0,zmm0,0xff (gdb) 0x00000000004004a5 in main () 2: x/i $pc => 0x4004a5 <main+14>: vzeroupper (gdb) print $zmm0.v16_int32 $1 = {-1 <repeats 16 times>} (gdb) print $zmm16.v16_int32 $2 = {-1 <repeats 16 times>} (gdb) si 0x00000000004004a8 in main () 2: x/i $pc => 0x4004a8 <main+17>: xor eax,eax (gdb) print $zmm16.v16_int32 $3 = {-1 <repeats 16 times>} (gdb) q A debugging session is active. Inferior 1 [process 56857] will be killed. Quit anyway? (y or n) y
Hello Jan, > The three AVX512 state components are entirely independent - one being in its "init > state" has no implication whatsoever on either of the other two. Fully separate > X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to prevent upper halves of > the upper 16 ZMM registers to display as if they were zero (when they aren't) after > e.g. VZEROALL/VZEROUPPER. > > gdb/ > 2018-09-05 Jan Beulich <jbeulich@suse.com> > > * i387-tdep.c (i387_supply_xsave): Split handling of > X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. > (i387_collect_xsave): Likewise. > > --- a/gdb/i387-tdep.c > +++ b/gdb/i387-tdep.c > @@ -923,7 +923,8 @@ i387_supply_xsave (struct regcache *regc > enum bfd_endian byte_order = gdbarch_byte_order (gdbarch); > struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); > const gdb_byte *regs = (const gdb_byte *) xsave; > - int i; > + int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) > + + std::min (tdep->num_zmm_regs, 16); It would be nice to comment on this magic 16 and the min operation. It's how XSAVE organizes things but it isn't entirely intuitive. > ULONGEST clear_bv; > static const gdb_byte zero[I386_MAX_REGISTER_SIZE] = { 0 }; > enum > @@ -1002,7 +1003,8 @@ i387_supply_xsave (struct regcache *regc > return; > > case avx512_zmm_h: > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + if ((clear_bv & (regnum < zmm_endlo_regnum ? X86_XSTATE_ZMM_H > + : X86_XSTATE_ZMM))) A comment that XSAVE stores the lower 16 registers in a different place than the higher 16 registers and also guards them by different XCR0 bits would be nice. We hid the different places behind those XSAVE_AVX512_ZMM_H_ADDR macros but there's nothing similar for the guard bits. Maybe add macros for the guard check, as well? > regcache->raw_supply (regnum, zero); > else > regcache->raw_supply (regnum, > @@ -1080,21 +1082,17 @@ i387_supply_xsave (struct regcache *regc > } > } > > - /* Handle the upper ZMM registers. */ > - if ((tdep->xcr0 & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + /* Handle the upper halves of the low 8/16 ZMM registers. */ > + if ((tdep->xcr0 & X86_XSTATE_ZMM_H)) > { > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + if ((clear_bv & X86_XSTATE_ZMM_H)) > { > - for (i = I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); > - i++) > + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > regcache->raw_supply (i, zero); > } > else > { > - for (i = I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); > - i++) > + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > regcache->raw_supply (i, > XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); > } > @@ -1119,11 +1117,13 @@ i387_supply_xsave (struct regcache *regc > } > } > > - /* Handle the YMM_AVX512 registers. */ > + /* Handle the upper 16 ZMM/YMM/XMM registers (if any). */ > if ((tdep->xcr0 & X86_XSTATE_ZMM)) > { > if ((clear_bv & X86_XSTATE_ZMM)) > { > + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + regcache->raw_supply (i, zero); > for (i = I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); > i++) > @@ -1135,6 +1135,9 @@ i387_supply_xsave (struct regcache *regc > } > else > { > + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + regcache->raw_supply (i, > + XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); This covers the upper halves of zmm16 to zmm31. Looking at the function it looks like the lower halves are covered in separate cases avx512_ymmh_avx512 and avx512_xmmh_avx512. Maybe reflect this in the comment? It currently suggests that it handles the entire upper 16 registers. > for (i = I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); > i++) > @@ -1340,7 +1343,8 @@ i387_collect_xsave (const struct regcach > gdb_byte *p, *regs = (gdb_byte *) xsave; > gdb_byte raw[I386_MAX_REGISTER_SIZE]; > ULONGEST initial_xstate_bv, clear_bv, xstate_bv = 0; > - unsigned int i; > + unsigned int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) > + + std::min (tdep->num_zmm_regs, 16); > enum > { > x87_ctrl_or_mxcsr = 0x1, > @@ -1441,9 +1445,8 @@ i387_collect_xsave (const struct regcach > i < I387_MPXEND_REGNUM (tdep); i++) > memset (XSAVE_MPX_ADDR (tdep, regs, i), 0, 8); > > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > - for (i = I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); i++) > + if ((clear_bv & X86_XSTATE_ZMM_H)) > + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); > > if ((clear_bv & X86_XSTATE_K)) > @@ -1453,6 +1456,8 @@ i387_collect_xsave (const struct regcach > > if ((clear_bv & X86_XSTATE_ZMM)) > { > + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); > for (i = I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); i++) > memset (XSAVE_YMM_AVX512_ADDR (tdep, regs, i), 0, 16); Looks OK to me. Regards, Markus. Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928
>>> On 24.09.18 at 19:19, <markus.t.metzger@intel.com> wrote: >> --- a/gdb/i387-tdep.c >> +++ b/gdb/i387-tdep.c >> @@ -923,7 +923,8 @@ i387_supply_xsave (struct regcache *regc >> enum bfd_endian byte_order = gdbarch_byte_order (gdbarch); >> struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); >> const gdb_byte *regs = (const gdb_byte *) xsave; >> - int i; >> + int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) >> + + std::min (tdep->num_zmm_regs, 16); > > It would be nice to comment on this magic 16 and the min operation. > It's how XSAVE organizes things but it isn't entirely intuitive. Okay, I can accept the desire for a (new) comment here, albeit it pretty much goes along the lines of what I say further down. >> @@ -1002,7 +1003,8 @@ i387_supply_xsave (struct regcache *regc >> return; >> >> case avx512_zmm_h: >> - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) >> + if ((clear_bv & (regnum < zmm_endlo_regnum ? X86_XSTATE_ZMM_H >> + : X86_XSTATE_ZMM))) > > A comment that XSAVE stores the lower 16 registers in a different place > than the higher 16 registers and also guards them by different XCR0 bits > would be nice. Such a comment would have been appropriate already prior to my patch. I have to admit that I'm not inclined to invest time in improving pre-existing (lack of) commentary, the more that how to exactly word this might be controversial. If that's really a requirement, then I'd abandon the patch (perhaps for someone else to pick up), maintaining it locally until the bug has been fixed upstream by whatever means. This is even more so that the extra time I'll have to put into playing with Simon's test case is also nothing I really have time for (but I'll try to find the time nevertheless). > We hid the different places behind those XSAVE_AVX512_ZMM_H_ADDR > macros but there's nothing similar for the guard bits. Maybe add macros > for the guard check, as well? Same here - nothing that this patch really changes. >> @@ -1135,6 +1135,9 @@ i387_supply_xsave (struct regcache *regc >> } >> else >> { >> + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) >> + regcache->raw_supply (i, >> + XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); > > This covers the upper halves of zmm16 to zmm31. Looking at the function it looks > like the lower halves are covered in separate cases avx512_ymmh_avx512 and > avx512_xmmh_avx512. Maybe reflect this in the comment? It currently suggests > that it handles the entire upper 16 registers. Same here - the split among code regions has been there before (also for e.g. the YMM register lower and upper halves). Irrespective of my mostly negative responses thanks for your comments, Jan
>>> On 10.10.18 at 17:12, wrote: > The three AVX512 state components are entirely independent - one being > in its "init state" has no implication whatsoever on either of the other > two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to > prevent upper halves of the upper 16 ZMM registers to display as if they > were zero (when they aren't) after e.g. VZEROALL/VZEROUPPER. > > gdb/ > 2018-10-10 Jan Beulich <jbeulich@suse.com> > > * i387-tdep.c (i387_supply_xsave): Split handling of > X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. > (i387_collect_xsave): Likewise. > > gdb/testsuite/ > 2018-10-10 Simon Marchi <simon.marchi@polymtl.ca> > > * testsuite/gdb.arch/i386-avx512.c, > testsuite/gdb.arch/i386-avx512.exp: Add 7th test. > > --- > v2: Attach comments to zmm_endlo_regnum declarations. Add testcase > provided by Simon. > > --- a/gdb/i387-tdep.c > +++ b/gdb/i387-tdep.c > @@ -924,6 +924,12 @@ i387_supply_xsave (struct regcache *regc > struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); > const gdb_byte *regs = (const gdb_byte *) xsave; > int i; > + /* In 64-bit mode the split between "low" and "high" ZMM registers is at > + ZMM16. Outside of 64-bit mode there are no "high" ZMM registers at > all. > + Precalculate the number to be used for the split point, with the all > + registers in the "low" portion outside of 64-bit mode. */ > + unsigned int zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) > + + std::min (tdep->num_zmm_regs, 16); > ULONGEST clear_bv; > static const gdb_byte zero[I386_MAX_REGISTER_SIZE] = { 0 }; > enum > @@ -1002,7 +1008,8 @@ i387_supply_xsave (struct regcache *regc > return; > > case avx512_zmm_h: > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + if ((clear_bv & (regnum < zmm_endlo_regnum ? X86_XSTATE_ZMM_H > + : X86_XSTATE_ZMM))) > regcache->raw_supply (regnum, zero); > else > regcache->raw_supply (regnum, > @@ -1080,21 +1087,17 @@ i387_supply_xsave (struct regcache *regc > } > } > > - /* Handle the upper ZMM registers. */ > - if ((tdep->xcr0 & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + /* Handle the upper halves of the low 8/16 ZMM registers. */ > + if ((tdep->xcr0 & X86_XSTATE_ZMM_H)) > { > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > + if ((clear_bv & X86_XSTATE_ZMM_H)) > { > - for (i = I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); > - i++) > + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > regcache->raw_supply (i, zero); > } > else > { > - for (i = I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); > - i++) > + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > regcache->raw_supply (i, > XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); > } > @@ -1119,11 +1122,13 @@ i387_supply_xsave (struct regcache *regc > } > } > > - /* Handle the YMM_AVX512 registers. */ > + /* Handle the upper 16 ZMM/YMM/XMM registers (if any). */ > if ((tdep->xcr0 & X86_XSTATE_ZMM)) > { > if ((clear_bv & X86_XSTATE_ZMM)) > { > + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + regcache->raw_supply (i, zero); > for (i = I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); > i++) > @@ -1135,6 +1140,9 @@ i387_supply_xsave (struct regcache *regc > } > else > { > + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + regcache->raw_supply (i, > + XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); > for (i = I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); > i++) > @@ -1341,6 +1349,9 @@ i387_collect_xsave (const struct regcach > gdb_byte raw[I386_MAX_REGISTER_SIZE]; > ULONGEST initial_xstate_bv, clear_bv, xstate_bv = 0; > unsigned int i; > + /* See the comment in i387_supply_xsave(). */ > + unsigned int zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) > + + std::min (tdep->num_zmm_regs, 16); > enum > { > x87_ctrl_or_mxcsr = 0x1, > @@ -1441,9 +1452,8 @@ i387_collect_xsave (const struct regcach > i < I387_MPXEND_REGNUM (tdep); i++) > memset (XSAVE_MPX_ADDR (tdep, regs, i), 0, 8); > > - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) > - for (i = I387_ZMM0H_REGNUM (tdep); > - i < I387_ZMMENDH_REGNUM (tdep); i++) > + if ((clear_bv & X86_XSTATE_ZMM_H)) > + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) > memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); > > if ((clear_bv & X86_XSTATE_K)) > @@ -1453,6 +1463,8 @@ i387_collect_xsave (const struct regcach > > if ((clear_bv & X86_XSTATE_ZMM)) > { > + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) > + memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); > for (i = I387_YMM16H_REGNUM (tdep); > i < I387_YMMH_AVX512_END_REGNUM (tdep); i++) > memset (XSAVE_YMM_AVX512_ADDR (tdep, regs, i), 0, 16); > --- a/gdb/testsuite/gdb.arch/i386-avx512.c > +++ b/gdb/testsuite/gdb.arch/i386-avx512.c > @@ -249,6 +249,13 @@ main (int argc, char **argv) > move back to array and check values. */ > move_zmm_data_to_memory (); > asm ("nop"); /* sixth breakpoint here */ > + > + asm ("vpternlogd $0xff, %zmm0, %zmm0, %zmm0"); > +#ifdef __x86_64__ > + asm ("vpternlogd $0xff, %zmm0, %zmm0, %zmm16"); > +#endif > + asm ("vzeroupper"); > + asm ("nop"); /* seventh breakpoint here */ > } > > return 0; > --- a/gdb/testsuite/gdb.arch/i386-avx512.exp > +++ b/gdb/testsuite/gdb.arch/i386-avx512.exp > @@ -174,3 +174,13 @@ for { set r 0 } { $r < $nr_regs } { incr > ".. = \\{f = \\{[expr $r + 30], [expr $r.125 + 30], [expr $r.25 + > 20], [expr $r.375 + 20], [expr $r.5 + 10], [expr $r.625 + 10], [expr $r.75 + > 10], [expr $r.875 + 10]\\}\\}.*" \ > "check contents of zmm_data\[$r\] after writing XMM regs" > } > + > +gdb_test "break [gdb_get_line_number "seventh breakpoint here"]" \ > + "Breakpoint .* at .*i386-avx512.c.*" \ > + "set seventh breakpoint in main" > +gdb_continue_to_breakpoint "continue to seventh breakpoint in main" > +gdb_test "print \$zmm0.v16_int32" "= {-1, -1, -1, -1, 0 <repeats 12 > times>}" > + > +if { $nr_regs >= 16 } { > + gdb_test "print \$zmm16.v16_int32" "= {-1 <repeats 16 times>}" > +} > > > >
On 2018-10-29 06:31, Jan Beulich wrote: >>>> On 10.10.18 at 17:12, wrote: >> The three AVX512 state components are entirely independent - one being >> in its "init state" has no implication whatsoever on either of the >> other >> two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM handling, to >> prevent upper halves of the upper 16 ZMM registers to display as if >> they >> were zero (when they aren't) after e.g. VZEROALL/VZEROUPPER. >> >> gdb/ >> 2018-10-10 Jan Beulich <jbeulich@suse.com> >> >> * i387-tdep.c (i387_supply_xsave): Split handling of >> X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. >> (i387_collect_xsave): Likewise. >> >> gdb/testsuite/ >> 2018-10-10 Simon Marchi <simon.marchi@polymtl.ca> >> >> * testsuite/gdb.arch/i386-avx512.c, >> testsuite/gdb.arch/i386-avx512.exp: Add 7th test. >> >> --- >> v2: Attach comments to zmm_endlo_regnum declarations. Add testcase >> provided by Simon. The testcase obviously LGTM. I will let Markus approve the other changes. Simon
> On 2018-10-29 06:31, Jan Beulich wrote: > >>>> On 10.10.18 at 17:12, wrote: > >> The three AVX512 state components are entirely independent - one > >> being in its "init state" has no implication whatsoever on either of > >> the other two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM > >> handling, to prevent upper halves of the upper 16 ZMM registers to > >> display as if they were zero (when they aren't) after e.g. > >> VZEROALL/VZEROUPPER. > >> > >> gdb/ > >> 2018-10-10 Jan Beulich <jbeulich@suse.com> > >> > >> * i387-tdep.c (i387_supply_xsave): Split handling of > >> X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. > >> (i387_collect_xsave): Likewise. > >> > >> gdb/testsuite/ > >> 2018-10-10 Simon Marchi <simon.marchi@polymtl.ca> > >> > >> * testsuite/gdb.arch/i386-avx512.c, > >> testsuite/gdb.arch/i386-avx512.exp: Add 7th test. > >> > >> --- > >> v2: Attach comments to zmm_endlo_regnum declarations. Add testcase > >> provided by Simon. > > The testcase obviously LGTM. I will let Markus approve the other changes. The code already looked good to me in v1. Thanks for adding comments. Markus. Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928
>>> On 07.11.18 at 10:07, <markus.t.metzger@intel.com> wrote: >> On 2018-10-29 06:31, Jan Beulich wrote: >> >>>> On 10.10.18 at 17:12, wrote: >> >> The three AVX512 state components are entirely independent - one >> >> being in its "init state" has no implication whatsoever on either of >> >> the other two. Fully separate X86_XSTATE_ZMM_H and X86_XSTATE_ZMM >> >> handling, to prevent upper halves of the upper 16 ZMM registers to >> >> display as if they were zero (when they aren't) after e.g. >> >> VZEROALL/VZEROUPPER. >> >> >> >> gdb/ >> >> 2018-10-10 Jan Beulich <jbeulich@suse.com> >> >> >> >> * i387-tdep.c (i387_supply_xsave): Split handling of >> >> X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. >> >> (i387_collect_xsave): Likewise. >> >> >> >> gdb/testsuite/ >> >> 2018-10-10 Simon Marchi <simon.marchi@polymtl.ca> >> >> >> >> * testsuite/gdb.arch/i386-avx512.c, >> >> testsuite/gdb.arch/i386-avx512.exp: Add 7th test. >> >> >> >> --- >> >> v2: Attach comments to zmm_endlo_regnum declarations. Add testcase >> >> provided by Simon. >> >> The testcase obviously LGTM. I will let Markus approve the other changes. > > The code already looked good to me in v1. Thanks for adding comments. So can I translate this into an ack for me to commit the change? Or else, who would be the one to give the go-ahead? Jan
> >>> On 07.11.18 at 10:07, <markus.t.metzger@intel.com> wrote: > >> On 2018-10-29 06:31, Jan Beulich wrote: > >> >>>> On 10.10.18 at 17:12, wrote: > >> >> The three AVX512 state components are entirely independent - one > >> >> being in its "init state" has no implication whatsoever on either > >> >> of the other two. Fully separate X86_XSTATE_ZMM_H and > >> >> X86_XSTATE_ZMM handling, to prevent upper halves of the upper 16 > >> >> ZMM registers to display as if they were zero (when they aren't) after e.g. > >> >> VZEROALL/VZEROUPPER. > >> >> > >> >> gdb/ > >> >> 2018-10-10 Jan Beulich <jbeulich@suse.com> > >> >> > >> >> * i387-tdep.c (i387_supply_xsave): Split handling of > >> >> X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. > >> >> (i387_collect_xsave): Likewise. > >> >> > >> >> gdb/testsuite/ > >> >> 2018-10-10 Simon Marchi <simon.marchi@polymtl.ca> > >> >> > >> >> * testsuite/gdb.arch/i386-avx512.c, > >> >> testsuite/gdb.arch/i386-avx512.exp: Add 7th test. > >> >> > >> >> --- > >> >> v2: Attach comments to zmm_endlo_regnum declarations. Add testcase > >> >> provided by Simon. > >> > >> The testcase obviously LGTM. I will let Markus approve the other changes. > > > > The code already looked good to me in v1. Thanks for adding comments. > > So can I translate this into an ack for me to commit the change? > Or else, who would be the one to give the go-ahead? Simon can approve your patch. IIRC Pedro had a question regarding gdbserver. From a first look, it seems to get the feature bits right but does not distinguish 32-bit and 64-bit mode regarding the number of available registers. Did you run the new test you added also in remote configuration? Markus. Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928
>>> On 07.11.18 at 14:18, <markus.t.metzger@intel.com> wrote: >> >>> On 07.11.18 at 10:07, <markus.t.metzger@intel.com> wrote: >> >> On 2018-10-29 06:31, Jan Beulich wrote: >> >> >>>> On 10.10.18 at 17:12, wrote: >> >> >> The three AVX512 state components are entirely independent - one >> >> >> being in its "init state" has no implication whatsoever on either >> >> >> of the other two. Fully separate X86_XSTATE_ZMM_H and >> >> >> X86_XSTATE_ZMM handling, to prevent upper halves of the upper 16 >> >> >> ZMM registers to display as if they were zero (when they aren't) after > e.g. >> >> >> VZEROALL/VZEROUPPER. >> >> >> >> >> >> gdb/ >> >> >> 2018-10-10 Jan Beulich <jbeulich@suse.com> >> >> >> >> >> >> * i387-tdep.c (i387_supply_xsave): Split handling of >> >> >> X86_XSTATE_ZMM_H and X86_XSTATE_ZMM. >> >> >> (i387_collect_xsave): Likewise. >> >> >> >> >> >> gdb/testsuite/ >> >> >> 2018-10-10 Simon Marchi <simon.marchi@polymtl.ca> >> >> >> >> >> >> * testsuite/gdb.arch/i386-avx512.c, >> >> >> testsuite/gdb.arch/i386-avx512.exp: Add 7th test. >> >> >> >> >> >> --- >> >> >> v2: Attach comments to zmm_endlo_regnum declarations. Add testcase >> >> >> provided by Simon. >> >> >> >> The testcase obviously LGTM. I will let Markus approve the other changes. >> > >> > The code already looked good to me in v1. Thanks for adding comments. >> >> So can I translate this into an ack for me to commit the change? >> Or else, who would be the one to give the go-ahead? > > Simon can approve your patch. > > IIRC Pedro had a question regarding gdbserver. From a first look, it seems to get the > feature bits right but does not distinguish 32-bit and 64-bit mode regarding the number > of available registers. If there was a problem there, it was my understanding that this can (and should) be fixed in a separate patch, by someone able to test this. > Did you run the new test you added also in remote configuration? No, and I also have no idea how I would go about doing so. Jan
On 2018-11-07 8:25 a.m., Jan Beulich wrote: >>> So can I translate this into an ack for me to commit the change? >>> Or else, who would be the one to give the go-ahead? >> >> Simon can approve your patch. If Markus is happy with the comments, it's all fine with me, please push (I see you are listed in MAINTAINERS, so I assume you have push access). >> >> IIRC Pedro had a question regarding gdbserver. From a first look, it seems to get the >> feature bits right but does not distinguish 32-bit and 64-bit mode regarding the number >> of available registers. > > If there was a problem there, it was my understanding that this can > (and should) be fixed in a separate patch, by someone able to test > this. Well, you are able to test it (see commands below) :). But indeed, I think we should push this patch, which already improves GDB. It would be nice to at least get an idea of the status of the test when ran with gdbserver. Since access to this hardware is quite limited, it would be appreciated if you could give it a quick shot and report the results. >> Did you run the new test you added also in remote configuration? > > No, and I also have no idea how I would go about doing so. Run: $ make check TESTS="gdb.arch/i386-avx512.exp" RUNTESTFLAGS="--target_board=native-gdbserver" $ make check TESTS="gdb.arch/i386-avx512.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver" in the gdb/ build directory. Thanks, Simon
>>> On 07.11.18 at 15:18, <simon.marchi@ericsson.com> wrote: > On 2018-11-07 8:25 a.m., Jan Beulich wrote: >>>> So can I translate this into an ack for me to commit the change? >>>> Or else, who would be the one to give the go-ahead? >>> >>> Simon can approve your patch. > > If Markus is happy with the comments, it's all fine with me, please push (I see > you are listed in MAINTAINERS, so I assume you have push access). Done. >>> IIRC Pedro had a question regarding gdbserver. From a first look, it seems to get the >>> feature bits right but does not distinguish 32-bit and 64-bit mode regarding the number >>> of available registers. >> >> If there was a problem there, it was my understanding that this can >> (and should) be fixed in a separate patch, by someone able to test >> this. > > Well, you are able to test it (see commands below) :). But indeed, I think we > should push this patch, which already improves GDB. It would be nice to at least > get an idea of the status of the test when ran with gdbserver. Since access to > this hardware is quite limited, it would be appreciated if you could give it a > quick shot and report the results. > >>> Did you run the new test you added also in remote configuration? >> >> No, and I also have no idea how I would go about doing so. > > Run: > > $ make check TESTS="gdb.arch/i386-avx512.exp" RUNTESTFLAGS="--target_board=native-gdbserver" I'm afraid there's more broken here than just this one test. Log file attached. I hope you can make sense of it. > $ make check TESTS="gdb.arch/i386-avx512.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver" This produces effectively the same (bad) results. Jan
On 2018-11-08 6:15 a.m., Jan Beulich wrote: > I'm afraid there's more broken here than just this one test. > Log file attached. I hope you can make sense of it. > >> $ make check TESTS="gdb.arch/i386-avx512.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver" > > This produces effectively the same (bad) results. Thanks I opened: https://sourceware.org/bugzilla/show_bug.cgi?id=23873 Simon
--- a/gdb/i387-tdep.c +++ b/gdb/i387-tdep.c @@ -923,7 +923,8 @@ i387_supply_xsave (struct regcache *regc enum bfd_endian byte_order = gdbarch_byte_order (gdbarch); struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch); const gdb_byte *regs = (const gdb_byte *) xsave; - int i; + int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) + + std::min (tdep->num_zmm_regs, 16); ULONGEST clear_bv; static const gdb_byte zero[I386_MAX_REGISTER_SIZE] = { 0 }; enum @@ -1002,7 +1003,8 @@ i387_supply_xsave (struct regcache *regc return; case avx512_zmm_h: - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) + if ((clear_bv & (regnum < zmm_endlo_regnum ? X86_XSTATE_ZMM_H + : X86_XSTATE_ZMM))) regcache->raw_supply (regnum, zero); else regcache->raw_supply (regnum, @@ -1080,21 +1082,17 @@ i387_supply_xsave (struct regcache *regc } } - /* Handle the upper ZMM registers. */ - if ((tdep->xcr0 & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) + /* Handle the upper halves of the low 8/16 ZMM registers. */ + if ((tdep->xcr0 & X86_XSTATE_ZMM_H)) { - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) + if ((clear_bv & X86_XSTATE_ZMM_H)) { - for (i = I387_ZMM0H_REGNUM (tdep); - i < I387_ZMMENDH_REGNUM (tdep); - i++) + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) regcache->raw_supply (i, zero); } else { - for (i = I387_ZMM0H_REGNUM (tdep); - i < I387_ZMMENDH_REGNUM (tdep); - i++) + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) regcache->raw_supply (i, XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); } @@ -1119,11 +1117,13 @@ i387_supply_xsave (struct regcache *regc } } - /* Handle the YMM_AVX512 registers. */ + /* Handle the upper 16 ZMM/YMM/XMM registers (if any). */ if ((tdep->xcr0 & X86_XSTATE_ZMM)) { if ((clear_bv & X86_XSTATE_ZMM)) { + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) + regcache->raw_supply (i, zero); for (i = I387_YMM16H_REGNUM (tdep); i < I387_YMMH_AVX512_END_REGNUM (tdep); i++) @@ -1135,6 +1135,9 @@ i387_supply_xsave (struct regcache *regc } else { + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) + regcache->raw_supply (i, + XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i)); for (i = I387_YMM16H_REGNUM (tdep); i < I387_YMMH_AVX512_END_REGNUM (tdep); i++) @@ -1340,7 +1343,8 @@ i387_collect_xsave (const struct regcach gdb_byte *p, *regs = (gdb_byte *) xsave; gdb_byte raw[I386_MAX_REGISTER_SIZE]; ULONGEST initial_xstate_bv, clear_bv, xstate_bv = 0; - unsigned int i; + unsigned int i, zmm_endlo_regnum = I387_ZMM0H_REGNUM (tdep) + + std::min (tdep->num_zmm_regs, 16); enum { x87_ctrl_or_mxcsr = 0x1, @@ -1441,9 +1445,8 @@ i387_collect_xsave (const struct regcach i < I387_MPXEND_REGNUM (tdep); i++) memset (XSAVE_MPX_ADDR (tdep, regs, i), 0, 8); - if ((clear_bv & (X86_XSTATE_ZMM_H | X86_XSTATE_ZMM))) - for (i = I387_ZMM0H_REGNUM (tdep); - i < I387_ZMMENDH_REGNUM (tdep); i++) + if ((clear_bv & X86_XSTATE_ZMM_H)) + for (i = I387_ZMM0H_REGNUM (tdep); i < zmm_endlo_regnum; i++) memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); if ((clear_bv & X86_XSTATE_K)) @@ -1453,6 +1456,8 @@ i387_collect_xsave (const struct regcach if ((clear_bv & X86_XSTATE_ZMM)) { + for (i = zmm_endlo_regnum; i < I387_ZMMENDH_REGNUM (tdep); i++) + memset (XSAVE_AVX512_ZMM_H_ADDR (tdep, regs, i), 0, 32); for (i = I387_YMM16H_REGNUM (tdep); i < I387_YMMH_AVX512_END_REGNUM (tdep); i++) memset (XSAVE_YMM_AVX512_ADDR (tdep, regs, i), 0, 16);