From patchwork Mon Sep 23 15:15:41 2024
X-Patchwork-Submitter: Guinevere Larsen
X-Patchwork-Id: 97870
From: Guinevere Larsen
To: gdb-patches@sourceware.org
Cc: Guinevere Larsen
Subject: [PATCH v4 6/7] gdb/record: support AVX instructions VMOVDQ(U|A) when recording
Date: Mon, 23 Sep 2024 12:15:41 -0300
Message-ID: <20240923151541.616723-8-guinevere@redhat.com>
In-Reply-To: <20240923151541.616723-2-guinevere@redhat.com>
References: <20240923151541.616723-2-guinevere@redhat.com>
MIME-Version: 1.0
This commit adds support for the instructions VMOVDQU and VMOVDQA, used
to move values to/from 256-bit registers.  Unfortunately, the
programmer's manual is very incomplete (if not wrong) about these
instructions, so the logic had to be reverse engineered from how gcc
actually encodes them.

This commit also changes the memory regions in the test to store 256
bits, so it's easier to test the instructions and to check that we're
recording ymm registers correctly.
---
 gdb/i386-tdep.c                              | 49 +++++++++++++++++++
 gdb/testsuite/gdb.reverse/i386-avx-reverse.c | 42 ++++++++++++++--
 .../gdb.reverse/i386-avx-reverse.exp         | 28 +++++++++++
 3 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index e4e808cf4b0..6d3b98dc302 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -5071,6 +5071,55 @@ i386_record_vex (struct i386_record_s *ir, uint8_t vex_w, uint8_t vex_r,
 	}
       break;
 
+    case 0x6f: /* VMOVDQ (U|A) */
+    case 0x7f: /* VMOVDQ (U|A) */
+      /* vmovdq instructions have information about source/destination
+	 spread over many places, so this code ended up messier than
+	 I'd like.  */
+      /* The VEX.pp bits identify if the move is aligned or not, but this
+	 doesn't influence the recording, so we can ignore it.  */
+      i386_record_modrm (ir);
+      /* The first bit of modrm identifies if both operands of the
+	 instruction are registers (bit = 1) or if one of the operands
+	 is memory.  */
+      if (ir->mod & 2)
+	{
+	  if (opcode == 0x6f)
+	    {
+	      /* vex_r will identify the high bit of the destination
+		 register.  The source is identified by ir->rex_b, but
+		 that doesn't matter for recording.  */
+	      record_full_arch_list_add_reg (ir->regcache,
+					     tdep->ymm0_regnum
+					     + 8 * vex_r + ir->reg);
+	    }
+	  else
+	    {
+	      /* The source operand is > 7 and the destination operand
+		 is <= 7.  This is special cased because here vex_r is
+		 used to identify the high bit of the SOURCE operand,
+		 not the destination, which would break the previous
+		 expression.  */
+	      record_full_arch_list_add_reg (ir->regcache,
+					     tdep->ymm0_regnum + ir->rm);
+	    }
+	}
+      else
+	{
+	  /* This is the easy branch.  We just need to check the opcode
+	     to see if the source or destination is memory.  */
+	  if (opcode == 0x6f)
+	    {
+	      record_full_arch_list_add_reg (ir->regcache,
+					     tdep->ymm0_regnum
+					     + ir->reg + vex_r * 8);
+	    }
+	  else
+	    {
+	      /* We're writing 256 bits, so 1 << 8.  */
+	      ir->ot = 8;
+	      i386_record_lea_modrm (ir);
+	    }
+	}
+      break;
+
     case 0x60: /* VPUNPCKLBW */
     case 0x61: /* VPUNPCKLWD */
     case 0x62: /* VPUNPCKLDQ */
diff --git a/gdb/testsuite/gdb.reverse/i386-avx-reverse.c b/gdb/testsuite/gdb.reverse/i386-avx-reverse.c
index 16303a42248..87574983c8a 100644
--- a/gdb/testsuite/gdb.reverse/i386-avx-reverse.c
+++ b/gdb/testsuite/gdb.reverse/i386-avx-reverse.c
@@ -20,8 +20,12 @@
 #include <stdlib.h>
 char global_buf0[] = {0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
+		      0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
+		      0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
 		      0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f};
 char global_buf1[] = {0, 0, 0, 0, 0, 0, 0, 0,
+		      0, 0, 0, 0, 0, 0, 0, 0,
+		      0, 0, 0, 0, 0, 0, 0, 0,
 		      0, 0, 0, 0, 0, 0, 0, 0};
 char *dyn_buf0;
 char *dyn_buf1;
@@ -30,8 +34,12 @@ int
 vmov_test ()
 {
   char buf0[] = {0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
+		 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
+		 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
 		 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f};
   char buf1[] = {0, 0, 0, 0, 0, 0, 0, 0,
+		 0, 0, 0, 0, 0, 0, 0, 0,
+		 0, 0, 0, 0, 0, 0, 0, 0,
 		 0, 0, 0, 0, 0, 0, 0, 0};
 
   /*start vmov_test.  */
@@ -73,6 +81,32 @@ vmov_test ()
   asm volatile ("vmovq %0, %%xmm15": "=m" (buf0));
   asm volatile ("vmovq %0, %%xmm15": "=m" (buf1));
 
+  /* Test vmovdq style instructions.  */
+  /* For local and dynamic buffers, we can't guarantee they will be
+     aligned.  However, the aligned and unaligned versions seem to be
+     encoded the same, so testing one is enough to validate both.  */
+
+  /* Operations based on local buffers.  */
+  asm volatile ("vmovdqu %0, %%ymm0": : "m"(buf0));
+  asm volatile ("vmovdqu %%ymm0, %0": "=m"(buf1));
+
+  /* Operations based on global buffers.  */
+  /* Global buffers seem to always be aligned, let's sanity check
+     vmovdqa.  */
+  asm volatile ("vmovdqa %0, %%ymm15": : "m"(global_buf0));
+  asm volatile ("vmovdqa %%ymm15, %0": "=m"(global_buf1));
+  asm volatile ("vmovdqu %0, %%ymm0": : "m"(global_buf0));
+  asm volatile ("vmovdqu %%ymm0, %0": "=m"(global_buf1));
+
+  /* Operations based on dynamic buffers.  */
+  /* The dynamic buffers are not aligned, so we skip vmovdqa.  */
+  asm volatile ("vmovdqu %0, %%ymm0": : "m"(*dyn_buf0));
+  asm volatile ("vmovdqu %%ymm0, %0": "=m"(*dyn_buf1));
+
+  /* Operations between 2 registers.  */
+  asm volatile ("vmovdqu %ymm15, %ymm0");
+  asm volatile ("vmovdqu %ymm2, %ymm15");
+  asm volatile ("vmovdqa %ymm15, %ymm0");
+
   /* We have a return statement to deal with
      epilogue in different compilers.  */
   return 0;
   /* end vmov_test */
@@ -161,11 +195,11 @@ vpbroadcast_test ()
 int
 main ()
 {
-  dyn_buf0 = (char *) malloc(sizeof(char) * 16);
-  dyn_buf1 = (char *) malloc(sizeof(char) * 16);
-  for (int i =0; i < 16; i++)
+  dyn_buf0 = (char *) malloc(sizeof(char) * 32);
+  dyn_buf1 = (char *) malloc(sizeof(char) * 32);
+  for (int i =0; i < 32; i++)
     {
-      dyn_buf0[i] = 0x20 + i;
+      dyn_buf0[i] = 0x20 + (i % 16);
       dyn_buf1[i] = 0;
     }
   /* Zero relevant xmm registers, so we know what to look for.  */
diff --git a/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp b/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp
index 75c313c2225..aea5e395cf8 100644
--- a/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp
+++ b/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp
@@ -134,6 +134,34 @@ global decimal
 if {[record_full_function "vmov"] == true} {
     # Now execute backwards, checking all instructions.
 
+    test_one_register "vmovdqa" "ymm0" \
+	"0x1f1e1d1c1b1a19181716151413121110, 0x1f1e1d1c1b1a19181716151413121110" \
+	"from register: "
+    test_one_register "vmovdqu" "ymm15" \
+	"0x1f1e1d1c1b1a19181716151413121110, 0x1f1e1d1c1b1a19181716151413121110" \
+	"from register: "
+    test_one_register "vmovdqu" "ymm0" \
+	"0x2f2e2d2c2b2a29282726252423222120, 0x2f2e2d2c2b2a29282726252423222120" \
+	"from register: "
+
+    test_one_memory "vmovdqu" "dyn_buf1" "0x0 .repeats 32 times" \
+	true "dynamic buffer: "
+    test_one_register "vmovdqu" "ymm0" \
+	"0x1f1e1d1c1b1a19181716151413121110, 0x1f1e1d1c1b1a19181716151413121110" \
+	"dynamic buffer: "
+
+    # Don't check the full buffer because that'd be too long.
+    test_one_memory "vmovdqu" "global_buf1" \
+	"0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19" \
+	"global buffer: "
+    test_one_register "vmovdqu" "ymm0" \
+	"0x3f3e3d3c3b3a39383736353433323130, 0x3f3e3d3c3b3a39383736353433323130" \
+	"global buffer: "
+    test_one_memory "vmovdqa" "global_buf1" "0x0 .repeats 32 times"
+    test_one_register "vmovdqa" "ymm15" "0x0, 0x0"
+
+    test_one_memory "vmovdqu" "buf1" "0x0 .repeats 32 times"
+    test_one_register "vmovdqu" "ymm0" "0x2726252423222120, 0x0" "local buffer: "
+
     test_one_register "vmovq" "xmm15" "0x3736353433323130" "reg_reset: "
     test_one_register "vmovq" "xmm15" "0x0"