[gdb/testsuite] Add have_linux_btrace_bug

Message ID 20230213181001.25142-1-tdevries@suse.de
State Rejected
Series [gdb/testsuite] Add have_linux_btrace_bug

Commit Message

Tom de Vries Feb. 13, 2023, 6:10 p.m. UTC
The Linux kernel commit 670638477aed ("perf/x86/intel/pt: Opportunistically
use single range output mode"), added in version 5.5.0, had a bug that was
fixed by commit ce0d998be927 ("perf/x86/intel/pt: Fix sampling using
single range output") in version 6.1.0.

The bug manifested on the Intel microarchitectures Rocket Lake, Raptor Lake and
Alder Lake.

Detect this set of conditions in a new proc have_linux_btrace_bug, and use it
in allow_btrace_tests.
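
The CPU-model half of that check can be illustrated in isolation.  The
following is a minimal sketch, not part of the patch: it decodes the family
and model from a CPUID leaf 1 EAX value using the same bit fields as the
patch's test program, then compares the result against the model numbers
listed there.  The names struct cpu_id, decode_family_model and
is_listed_model are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical helper type for this sketch; not part of GDB.  */
struct cpu_id
{
  unsigned int family;
  unsigned int model;
};

/* Decode family and model from the EAX value returned by CPUID leaf 1.  */
struct cpu_id
decode_family_model (unsigned int eax)
{
  unsigned int ex_fam_id = (eax >> 20) & 0xff;
  unsigned int ex_mod_id = (eax >> 16) & 0xf;
  struct cpu_id id;

  id.family = (eax >> 8) & 0xf;
  id.model = (eax >> 4) & 0xf;

  /* The extended model field only applies to families 6 and 15.  */
  if (id.family == 6 || id.family == 15)
    id.model += ex_mod_id << 4;
  /* The extended family field is only added for family 15.  */
  if (id.family == 15)
    id.family += ex_fam_id;

  return id;
}

/* Return true if EAX identifies one of the family 6 models that the
   patch's test program checks for.  */
bool
is_listed_model (unsigned int eax)
{
  struct cpu_id id = decode_family_model (eax);

  if (id.family != 6)
    return false;

  switch (id.model)
    {
    case 167:	/* Rocket Lake.  */
    case 151:	/* Alder Lake.  */
    case 154:	/* Alder Lake.  */
    case 183:	/* Raptor Lake.  */
      return true;
    default:
      return false;
    }
}
```

For example, an EAX value of 0x90672 decodes to family 6, model 151, so
is_listed_model returns true for it.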

I initially planned to add just a "require !have_linux_btrace_bug" to the
failing test-cases.  That looked OK for PR30065 (with libipt), where only one
test-case fails, but PR30073 (without libipt) has a lot of fails.

Tested on x86_64-linux.

PR testsuite/30073
PR testsuite/30075
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30073
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30075
---
 gdb/testsuite/lib/gdb.exp | 103 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)


base-commit: 14d0e6818a022b72c265f15f63c8ccc2fc8c302a
  

Comments

Terekhov, Mikhail via Gdb-patches Feb. 13, 2023, 7:17 p.m. UTC | #1
Hello Tom,

>The linux kernel commit 670638477aed ("perf/x86/intel/pt: Opportunistically
>use single range output mode"), added in version v5.5.0 had a bug that was
>fixed by commit ce0d998be927 ("perf/x86/intel/pt: Fix sampling using
>single range output") in version 6.1.0.
>
>The bug manifested for intel microarchitectures Rocket Lake, Raptor Lake and
>Alder Lake.

Actually, it's a h/w bug that got exposed by using single-range output.  It affects
Core processors starting from Ice Lake and it only affects Processor Trace.  Also,
it is only exposed by the py-record-btrace test, which does a lot of single-stepping.

It might be better to just add an XFAIL for that one test.  I'm not sure if maintaining
a processor list makes sense.  The kernel patch disables single-range for > 1 page
for all processors and does not try to maintain a list of affected processors.  We might
want to do the same in GDB and either disable that test for kernels between 5.5 and
6.1, or set up an XFAIL.
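
A check keyed only on the kernel version, with no processor list, could look
roughly like the sketch below.  It is an illustration under assumptions, not
code from GDB or the kernel: the function name kernel_in_affected_range is
hypothetical, the input is assumed to be a release string in the format
printed by "uname -r", and the bounds 5.5.0 (inclusive) and 6.1.0 (exclusive)
are the ones used in the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Return true if RELEASE (a string in the format printed by "uname -r",
   such as "5.14.21-default") names a kernel version in the half-open
   range [5.5.0, 6.1.0).  Strings that do not start with three
   dot-separated numbers count as unaffected.  */
bool
kernel_in_affected_range (const char *release)
{
  unsigned int major, minor, patch;

  if (sscanf (release, "%u.%u.%u", &major, &minor, &patch) != 3)
    return false;

  /* Both bounds have the form X.Y.0, so comparing (major, minor) is
     enough; the patch level cannot change the outcome.  */
  bool at_least_5_5 = major > 5 || (major == 5 && minor >= 5);
  bool below_6_1 = major < 6 || (major == 6 && minor < 1);

  return at_least_5_5 && below_6_1;
}
```

With this sketch, "5.5.0" and "6.0.12" fall inside the range, while "5.4.0"
and "6.1.0" fall outside it.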

The Branch Trace Store issue you found seems to affect all btrace tests on ADL
E-cores.  This is a different issue.  I can reproduce it and I am currently debugging it.

Regards,
Markus.

  

Patch

diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 7f98f080328..a76397f50b8 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -3851,6 +3851,10 @@  gdb_caching_proc allow_btrace_tests {
     gdb_exit
     remote_file build delete $obj
 
+    if { $allow_btrace_tests } {
+	set allow_btrace_tests [expr ![have_linux_btrace_bug]]
+    }
+
     verbose "$me:  returning $allow_btrace_tests" 2
     return $allow_btrace_tests
 }
@@ -9374,5 +9378,104 @@  proc has_dependency { file dep } {
     return [regexp $dep $output]
 }
 
+# Return 1 if the Linux kernel btrace bug introduced in kernel commit
+# 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output
+# mode") may manifest on the target, and 0 otherwise.
+
+gdb_caching_proc have_linux_btrace_bug {
+    set me "have_linux_btrace_bug"
+
+    if { ![istarget "i?86-*-*"] && ![istarget "x86_64-*-*"] } {
+	return 0
+    }
+
+    if { ![istarget *-*-linux*] } {
+	return 0
+    }
+
+    set res [remote_exec target "uname -r"]
+    set status [lindex $res 0]
+    set output [lindex $res 1]
+    if { $status != 0 } {
+	return 0
+    }
+
+    set re ^($::decimal)\\.($::decimal)\\.($::decimal)
+    if { [regexp $re $output dummy v1 v2 v3] != 1 } {
+	return 0
+    }
+    set v [list $v1 $v2 $v3]
+
+    set affected_version \
+	[expr [version_compare [list 5 5 0] <= $v] \
+	     && [version_compare $v < [list 6 1 0]]]
+    if { ! $affected_version } {
+	return 0
+    }
+
+    # Compile a test program.
+    set src {
+	#include "nat/x86-gcc-cpuid.h"
+
+	int main() {
+	  unsigned int eax, ebx, ecx, edx;
+
+	  if (!__get_cpuid (0, &eax, &ebx, &ecx, &edx))
+	    return 0;
+
+	  int intel_p = (ebx == signature_INTEL_ebx
+			 && ecx == signature_INTEL_ecx
+			 && edx == signature_INTEL_edx);
+
+	  if (!intel_p)
+	    return 0;
+
+	  if (! __get_cpuid (1, &eax, &ebx, &ecx, &edx))
+	    return 0;
+
+	  unsigned int ex_fam_id = (eax >> 20) & 0xff;
+	  unsigned int ex_mod_id = (eax >> 16) & 0xf;
+	  unsigned int fam_id = (eax >> 8) & 0xf;
+	  unsigned int model = (eax >> 4) & 0xf;
+
+	  if (fam_id == 6 || fam_id == 15)
+	    model = model + (ex_mod_id << 4);
+	  if (fam_id == 15)
+	    fam_id = fam_id + ex_fam_id;
+
+	  if (fam_id == 6)
+	    {
+	      /* Rocket Lake.  */
+	      if (model == 167)
+		return 1;
+	      /* Alder Lake.  */
+	      if (model == 151 || model == 154)
+		return 1;
+	      /* Raptor Lake.  */
+	      if (model == 183)
+		return 1;
+	    }
+
+	  return 0;
+	}
+    }
+
+    set flags "incdir=$::srcdir/.."
+    if { ! [gdb_simple_compile $me $src executable $flags] } {
+	return 0
+    }
+
+    set result [remote_exec target $obj]
+    set status [lindex $result 0]
+    set output [lindex $result 1]
+    if { $output != "" } {
+	set status 0
+    }
+
+    remote_file build delete $obj
+
+    return $status
+}
+
 # Always load compatibility stuff.
 load_lib future.exp