From patchwork Mon Sep 23 15:15:41 2024
X-Patchwork-Submitter: Guinevere Larsen
X-Patchwork-Id: 97870
From: Guinevere Larsen
To: gdb-patches@sourceware.org
Cc: Guinevere Larsen
Subject: [PATCH v4 6/7] gdb/record: support AVX instructions VMOVDQ(U|A) when recording
Date: Mon, 23 Sep 2024 12:15:41 -0300
Message-ID: <20240923151541.616723-8-guinevere@redhat.com>
In-Reply-To: <20240923151541.616723-2-guinevere@redhat.com>
References: <20240923151541.616723-2-guinevere@redhat.com>
MIME-Version: 1.0
This commit adds support for the instructions VMOVDQU and VMOVDQA, used
to move values to/from 256-bit registers.  Unfortunately, the
programmer's manual is very incomplete (if not wrong) about these
instructions, so the logic had to be reverse engineered from how gcc
actually encodes them.

This commit also changes the memory regions in the test to store 256
bits, so it's easier to test the instructions and to check that we're
recording ymm registers correctly.
---
 gdb/i386-tdep.c                              | 49 +++++++++++++++++++
 gdb/testsuite/gdb.reverse/i386-avx-reverse.c | 42 ++++++++++++++--
 .../gdb.reverse/i386-avx-reverse.exp         | 28 +++++++++++
 3 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/gdb/i386-tdep.c b/gdb/i386-tdep.c
index e4e808cf4b0..6d3b98dc302 100644
--- a/gdb/i386-tdep.c
+++ b/gdb/i386-tdep.c
@@ -5071,6 +5071,55 @@ i386_record_vex (struct i386_record_s *ir, uint8_t vex_w, uint8_t vex_r,
 	}
       break;
 
+    case 0x6f: /* VMOVDQ (U|A) */
+    case 0x7f: /* VMOVDQ (U|A) */
+      /* vmovdq instructions have information about source/destination
+	 spread over many places, so this code ended up messier than
+	 I'd like.  */
+      /* The VEX.pp bits identify if the move is aligned or not, but this
+	 doesn't influence the recording, so we can ignore it.  */
+      i386_record_modrm (ir);
+      /* The first bit of modrm identifies if both operands of the
+	 instruction are registers (bit = 1) or if one of the operands
+	 is memory.  */
+      if (ir->mod & 2)
+	{
+	  if (opcode == 0x6f)
+	    {
+	      /* vex_r will identify the high bit of the destination
+		 register.  The source is identified by ir->rex_b, but
+		 that doesn't matter for recording.  */
+	      record_full_arch_list_add_reg (ir->regcache,
+					     tdep->ymm0_regnum
+					     + 8 * vex_r + ir->reg);
+	    }
+	  else
+	    {
+	      /* The source operand is > 7 and the destination operand
+		 is <= 7.  This is special cased because here vex_r is
+		 used to identify the high bit of the SOURCE operand,
+		 not the destination, which would break the previous
+		 expression.  */
+	      record_full_arch_list_add_reg (ir->regcache,
+					     tdep->ymm0_regnum + ir->rm);
+	    }
+	}
+      else
+	{
+	  /* This is the easy branch.  We just need to check the opcode
+	     to see if the source or destination is memory.  */
+	  if (opcode == 0x6f)
+	    {
+	      record_full_arch_list_add_reg (ir->regcache,
+					     tdep->ymm0_regnum
+					     + ir->reg + vex_r * 8);
+	    }
+	  else
+	    {
+	      /* We're writing 256 bits, so 1 << 8.  */
+	      ir->ot = 8;
+	      i386_record_lea_modrm (ir);
+	    }
+	}
+      break;
+
     case 0x60: /* VPUNPCKLBW */
     case 0x61: /* VPUNPCKLWD */
     case 0x62: /* VPUNPCKLDQ */
diff --git a/gdb/testsuite/gdb.reverse/i386-avx-reverse.c b/gdb/testsuite/gdb.reverse/i386-avx-reverse.c
index 16303a42248..87574983c8a 100644
--- a/gdb/testsuite/gdb.reverse/i386-avx-reverse.c
+++ b/gdb/testsuite/gdb.reverse/i386-avx-reverse.c
@@ -20,8 +20,12 @@
 #include <stdlib.h>
 char global_buf0[] = {0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
+		      0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
+		      0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
 		      0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f};
 char global_buf1[] = {0, 0, 0, 0, 0, 0, 0, 0,
+		      0, 0, 0, 0, 0, 0, 0, 0,
+		      0, 0, 0, 0, 0, 0, 0, 0,
 		      0, 0, 0, 0, 0, 0, 0, 0};
 char *dyn_buf0;
 char *dyn_buf1;
@@ -30,8 +34,12 @@ int
 vmov_test ()
 {
   char buf0[] = {0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
+		 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
+		 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37,
 		 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f};
   char buf1[] = {0, 0, 0, 0, 0, 0, 0, 0,
+		 0, 0, 0, 0, 0, 0, 0, 0,
+		 0, 0, 0, 0, 0, 0, 0, 0,
 		 0, 0, 0, 0, 0, 0, 0, 0};
 
   /*start vmov_test.  */
@@ -73,6 +81,32 @@ vmov_test ()
   asm volatile ("vmovq %0, %%xmm15": "=m" (buf0));
   asm volatile ("vmovq %0, %%xmm15": "=m" (buf1));
 
+  /* Test vmovdq style instructions.  */
+  /* For local and dynamic buffers, we can't guarantee they will be
+     aligned.  However, the aligned and unaligned versions seem to be
+     encoded the same, so testing one is enough to validate both.  */
+
+  /* Operations based on local buffers.  */
+  asm volatile ("vmovdqu %0, %%ymm0": : "m"(buf0));
+  asm volatile ("vmovdqu %%ymm0, %0": "=m"(buf1));
+
+  /* Operations based on global buffers.  */
+  /* Global buffers seem to always be aligned, let's sanity check
+     vmovdqa.  */
+  asm volatile ("vmovdqa %0, %%ymm15": : "m"(global_buf0));
+  asm volatile ("vmovdqa %%ymm15, %0": "=m"(global_buf1));
+  asm volatile ("vmovdqu %0, %%ymm0": : "m"(global_buf0));
+  asm volatile ("vmovdqu %%ymm0, %0": "=m"(global_buf1));
+
+  /* Operations based on dynamic buffers.  */
+  /* The dynamic buffers are not aligned, so we skip vmovdqa.  */
+  asm volatile ("vmovdqu %0, %%ymm0": : "m"(*dyn_buf0));
+  asm volatile ("vmovdqu %%ymm0, %0": "=m"(*dyn_buf1));
+
+  /* Operations between 2 registers.  */
+  asm volatile ("vmovdqu %ymm15, %ymm0");
+  asm volatile ("vmovdqu %ymm2, %ymm15");
+  asm volatile ("vmovdqa %ymm15, %ymm0");
+
   /* We have a return statement to deal with
      epilogue in different compilers.  */
   return 0;
   /* end vmov_test */
@@ -161,11 +195,11 @@ vpbroadcast_test ()
 int
 main ()
 {
-  dyn_buf0 = (char *) malloc(sizeof(char) * 16);
-  dyn_buf1 = (char *) malloc(sizeof(char) * 16);
-  for (int i =0; i < 16; i++)
+  dyn_buf0 = (char *) malloc(sizeof(char) * 32);
+  dyn_buf1 = (char *) malloc(sizeof(char) * 32);
+  for (int i =0; i < 32; i++)
     {
-      dyn_buf0[i] = 0x20 + i;
+      dyn_buf0[i] = 0x20 + (i % 16);
       dyn_buf1[i] = 0;
     }
   /* Zero relevant xmm registers, so we know what to look for.  */
diff --git a/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp b/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp
index 75c313c2225..aea5e395cf8 100644
--- a/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp
+++ b/gdb/testsuite/gdb.reverse/i386-avx-reverse.exp
@@ -134,6 +134,34 @@ global decimal
 if {[record_full_function "vmov"] == true} {
     # Now execute backwards, checking all instructions.
 
+    test_one_register "vmovdqa" "ymm0" \
+	"0x1f1e1d1c1b1a19181716151413121110, 0x1f1e1d1c1b1a19181716151413121110" \
+	"from register: "
+    test_one_register "vmovdqu" "ymm15" \
+	"0x1f1e1d1c1b1a19181716151413121110, 0x1f1e1d1c1b1a19181716151413121110" \
+	"from register: "
+    test_one_register "vmovdqu" "ymm0" \
+	"0x2f2e2d2c2b2a29282726252423222120, 0x2f2e2d2c2b2a29282726252423222120" \
+	"from register: "
+
+    test_one_memory "vmovdqu" "dyn_buf1" "0x0 .repeats 32 times" \
+	true "dynamic buffer: "
+    test_one_register "vmovdqu" "ymm0" \
+	"0x1f1e1d1c1b1a19181716151413121110, 0x1f1e1d1c1b1a19181716151413121110" \
+	"dynamic buffer: "
+
+    # Don't check the full buffer because that'd be too long.
+    test_one_memory "vmovdqu" "global_buf1" \
+	"0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19" \
+	"global buffer: "
+    test_one_register "vmovdqu" "ymm0" \
+	"0x3f3e3d3c3b3a39383736353433323130, 0x3f3e3d3c3b3a39383736353433323130" \
+	"global buffer: "
+    test_one_memory "vmovdqa" "global_buf1" "0x0 .repeats 32 times"
+    test_one_register "vmovdqa" "ymm15" "0x0, 0x0"
+
+    test_one_memory "vmovdqu" "buf1" "0x0 .repeats 32 times"
+    test_one_register "vmovdqu" "ymm0" "0x2726252423222120, 0x0" "local buffer: "
+
     test_one_register "vmovq" "xmm15" "0x3736353433323130" "reg_reset: "
     test_one_register "vmovq" "xmm15" "0x0"