[RFC,2/3] Making ANSI terminal escape sequences work in TUI

Message ID 20180903184300.9961-3-tom@tromey.com
State New, archived

Commit Message

Tom Tromey Sept. 3, 2018, 6:42 p.m. UTC
  PR tui/14126 notes that ANSI terminal escape sequences don't affect
the colors shown in the TUI.  A simple way to see this is to try the
extended-prompt example from the gdb manual.

Curses does not pass escape sequences through to the terminal.
Instead, it replaces non-printable characters with a visible
representation, for example "^[" for the ESC character.
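
To illustrate the substitution described above, here is a minimal
sketch of caret notation for control characters.  The helper name
visible_repr is hypothetical and is not part of gdb or curses.  It
also ignores the characters that curses treats specially, such as
newline and tab.

```cpp
#include <string>

// Hypothetical helper for illustration only: return the caret
// notation used to display a control character.
static std::string
visible_repr (unsigned char c)
{
  if (c < 0x20)
    return std::string ("^") + char (c + 0x40);	// ESC (0x1b) gives "^[".
  if (c == 0x7f)
    return "^?";
  // Printable characters are shown unchanged.
  return std::string (1, char (c));
}
```

With this, an ESC that begins a color sequence comes out as the two
visible characters "^[" instead of reaching the terminal.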

This patch fixes the problem by adding a simple ANSI terminal sequence
parser to gdb.  These sequences are decoded and those that are
recognized are turned into the appropriate curses calls.
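
As a rough illustration of the decoding step, the sketch below splits
a sequence into its data bytes and final byte.  It is not the code in
this patch: the patch uses a compiled regular expression, while this
sketch swaps in a hand-written scanner over the same byte ranges so
that it stands alone.  The names csi_sequence and scan_csi are
hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not a gdb type.
struct csi_sequence
{
  std::string data;		// Parameter and intermediate bytes.
  char final_byte = 0;		// For example 'm' for color changes.
  std::size_t length = 0;	// Bytes consumed; 0 means "no match".
};

// Scan BUF for a sequence of the form ESC '[' data final-byte at its
// very start.
static csi_sequence
scan_csi (const std::string &buf)
{
  csi_sequence result;
  if (buf.size () < 3 || buf[0] != '\x1b' || buf[1] != '[')
    return result;

  std::size_t i = 2;
  // Parameter bytes, 0x30 through 0x3f.
  while (i < buf.size () && buf[i] >= 0x30 && buf[i] <= 0x3f)
    ++i;
  // Intermediate bytes, 0x20 through 0x2f.
  while (i < buf.size () && buf[i] >= 0x20 && buf[i] <= 0x2f)
    ++i;
  // The final byte, 0x40 through 0x7e, must be present.
  if (i >= buf.size () || buf[i] < 0x40 || buf[i] > 0x7e)
    return result;

  result.data = buf.substr (2, i - 2);
  result.final_byte = buf[i];
  result.length = i + 1;
  return result;
}
```

A caller would skip "length" bytes of input and, when the final byte
is 'm', walk the ";"-separated numbers in "data" to decide which
attributes to change.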

The curses approach to color handling is unusual and so there are some
oddities in the implementation.

First, standard curses has no notion of the default colors of the terminal.
So, if you set the foreground color, it is not possible to reset it --
you have to pick some other color.  ncurses provides an extension to
handle this, so this patch updates configure and uses it when
available.

Second, in curses, colors always come in pairs: you cannot set just
the foreground.  This patch handles this by tracking actually-used
pairs of colors and keeping a table of these for reuse.

Third, there are a limited number of such pairs available.  In this
patch, if you try to use too many color combinations, gdb will just
ignore some color changes.
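
The pair bookkeeping described in the two paragraphs above can be
sketched as follows.  This is a simplified stand-in, not the code in
the patch: max_pairs is a hypothetical constant used in place of the
curses COLOR_PAIRS limit so that the sketch runs without a terminal,
and the init_pair call is only indicated by a comment.

```cpp
#include <map>
#include <utility>

// Hypothetical stand-in for the curses COLOR_PAIRS limit.
static const int max_pairs = 4;

// Maps a (foreground, background) combination to its pair index.
static std::map<std::pair<int, int>, int> pair_table;

// Return the pair index for FG and BG, allocating a new one if this
// combination has not been seen before.  Index 0 is the default pair
// and doubles as the fallback once the table is full.
static int
lookup_pair (int fg, int bg)
{
  std::pair<int, int> key (fg, bg);
  auto it = pair_table.find (key);
  if (it != pair_table.end ())
    return it->second;

  int next = int (pair_table.size ()) + 1;
  if (next >= max_pairs)
    return 0;
  // A real implementation would call init_pair (next, fg, bg) here.
  pair_table[key] = next;
  return next;
}
```

Asking for the same combination twice returns the same index.  Once
the limit is reached, new combinations quietly map to the default
pair, which is why some color changes are ignored.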

Finally, in addition to limiting the number of color pairs, curses
also limits the number of colors.  This means that, when using
extended 8- or 24-bit color sequences, it may be possible to exhaust
the curses color table.
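
To make the color-table pressure concrete, the sketch below converts
an 8-bit color index in the 16..255 range to RGB using the same
integer arithmetic as extended_color in this patch, then scales a
channel to the 0..1000 range that init_color expects.  The names rgb,
index_to_rgb and to_curses_scale are hypothetical.  Each distinct
triple produced this way needs its own slot in the curses color
table.

```cpp
// Hypothetical type for this sketch; channels are 0..255.
struct rgb
{
  int r;
  int g;
  int b;
};

// Convert an 8-bit color index to RGB.  VALUE must be in the range
// 16..255: 16..231 is a 6x6x6 color cube and 232..255 is a gray ramp.
static rgb
index_to_rgb (int value)
{
  if (value >= 232)
    {
      int v = (value - 232) * 255 / 24;
      return { v, v, v };
    }
  value -= 16;
  int r = (value / 36) * 255 / 6;
  value %= 36;
  int g = (value / 6) * 255 / 6;
  int b = (value % 6) * 255 / 6;
  return { r, g, b };
}

// Scale a 0..255 channel to the 0..1000 range used by init_color.
static int
to_curses_scale (int channel)
{
  return channel * 1000 / 255;
}
```

The 240 indices from 16 to 255 alone can ask for more distinct colors
than a terminal with a small COLORS value can define, before any
24-bit sequences are considered.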

I am very sour on the curses design now.

I do not know how to write a test for this, so I did not.

gdb/ChangeLog
2018-09-03  Tom Tromey  <tom@tromey.com>

	PR tui/14126:
	* tui/tui.c (tui_enable): Call start_color and
	use_default_colors.
	* tui/tui-io.c (ansi_regex_text): New constant.
	(DATA_SUBEXP, FINAL_SUBEXP, NUM_SUBEXPRESSIONS): New defines.
	(ansi_regex): New global.
	(struct color_pair): New.
	(color_pair_map, last_color_pair): New globals.
	(tui_setup_io): Clean up color map when shutting down.
	(curses_colors): New constant.
	(get_color_pair, apply_ansi_escape): New functions.
	(tui_write): Rewrite.
	(tui_puts_internal): New function, from tui_puts.  Add "height"
	parameter.
	(tui_puts): Use tui_puts_internal.
	(tui_redisplay_readline): Use tui_puts_internal.
	(_initialize_tui_io): New function.
	(struct rgb_color): New.
	(color_map, bright_colors): New globals.
	(get_color): New function.
	(struct ansi_color): New.
	(read_semi_number, extended_color, reset_attrs): New functions.
	* configure.ac: Check for use_default_colors.
	* config.in, configure: Rebuild.
---
 gdb/ChangeLog    |  27 +++
 gdb/config.in    |   3 +
 gdb/configure    |   2 +-
 gdb/configure.ac |   2 +-
 gdb/tui/tui-io.c | 555 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 gdb/tui/tui.c    |   9 +
 6 files changed, 580 insertions(+), 18 deletions(-)
  

Comments

Tom Tromey Oct. 11, 2018, 11:03 p.m. UTC | #1
>>>>> "Tom" == Tom Tromey <tom@tromey.com> writes:

Tom> This patch fixes the problem by adding a simple ANSI terminal sequence
Tom> parser to gdb.  These sequences are decoded and those that are
Tom> recognized are turned into the appropriate curses calls.

I've extended this to colorize the source display (and "list" output)
using GNU Source Highlight.  I plan to push this series soon so that I
can submit that work.

GNU Source Highlight seems adequate, though under-maintained; it hasn't
had a release in a while and doesn't (gasp) have a Rust highlighter.  I'm
considering adding a Python API instead and switching to Pygments.  I
hate to lose the GNU connection, though.  Let me know what you think.

Tom
  

Patch

diff --git a/gdb/config.in b/gdb/config.in
index 01acda124b9..8ccffd9d882 100644
--- a/gdb/config.in
+++ b/gdb/config.in
@@ -540,6 +540,9 @@ 
 /* Define to 1 if you have the <unistd.h> header file. */
 #undef HAVE_UNISTD_H
 
+/* Define to 1 if you have the `use_default_colors' function. */
+#undef HAVE_USE_DEFAULT_COLORS
+
 /* Define to 1 if you have the `vfork' function. */
 #undef HAVE_VFORK
 
diff --git a/gdb/configure b/gdb/configure
index d207c2baf1b..72af1071e74 100755
--- a/gdb/configure
+++ b/gdb/configure
@@ -13268,7 +13268,7 @@  for ac_func in getauxval getrusage getuid getgid \
 		sigaction sigprocmask sigsetmask socketpair \
 		ttrace wborder wresize setlocale iconvlist libiconvlist btowc \
 		setrlimit getrlimit posix_madvise waitpid \
-		ptrace64 sigaltstack mkdtemp setns
+		ptrace64 sigaltstack mkdtemp setns use_default_colors
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/gdb/configure.ac b/gdb/configure.ac
index 13bc5f9a8f2..9065a660683 100644
--- a/gdb/configure.ac
+++ b/gdb/configure.ac
@@ -1355,7 +1355,7 @@  AC_CHECK_FUNCS([getauxval getrusage getuid getgid \
 		sigaction sigprocmask sigsetmask socketpair \
 		ttrace wborder wresize setlocale iconvlist libiconvlist btowc \
 		setrlimit getrlimit posix_madvise waitpid \
-		ptrace64 sigaltstack mkdtemp setns])
+		ptrace64 sigaltstack mkdtemp setns use_default_colors])
 AM_LANGINFO_CODESET
 GDB_AC_COMMON
 
diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index b1bda6e4f54..a3869028b63 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -40,6 +40,8 @@ 
 #include "filestuff.h"
 #include "completer.h"
 #include "gdb_curses.h"
+#include "gdb_regex.h"
+#include <map>
 
 /* This redefines CTRL if it is not already defined, so it must come
    after terminal state releated include files like <term.h> and
@@ -48,6 +50,34 @@ 
 
 static int tui_getc (FILE *fp);
 
+/* A regular expression that is used for matching ANSI terminal escape
+   sequences.  */
+
+static const char *ansi_regex_text =
+  /* Introduction.  */
+  "^\e\\["
+#define DATA_SUBEXP 1
+  /* Capture parameter and intermediate bytes.  */
+  "("
+  /* Parameter bytes.  */
+  "[\x30-\x3f]*"
+  /* Intermediate bytes.  */
+  "[\x20-\x2f]*"
+  /* End the first capture.  */
+  ")"
+  /* The final byte.  */
+#define FINAL_SUBEXP 2
+  "([\x40-\x7e])";
+
+/* The number of subexpressions to allocate space for, including the
+   "0th" whole match subexpression.  */
+#define NUM_SUBEXPRESSIONS 3
+
+/* The compiled form of ansi_regex_text.  */
+
+static regex_t ansi_regex;
+
+
 static int
 key_is_start_sequence (int ch)
 {
@@ -188,6 +218,458 @@  tui_putc (char c)
   update_cmdwin_start_line ();
 }
 
+/* This holds an RGB color.  */
+
+struct rgb_color
+{
+  short r;
+  short g;
+  short b;
+
+  bool operator< (const rgb_color &o) const
+  {
+    if (r < o.r)
+      return true;
+    else if (r == o.r)
+      {
+	if (g < o.g)
+	  return true;
+	else if (g == o.g)
+	  return b < o.b;
+      }
+    return false;
+  }
+};
+
+/* This maps RGB values to their corresponding color index.  */
+
+static std::map<rgb_color, int> color_map;
+
+/* This holds a pair of colors and is used to track the mapping
+   between a color pair index and the actual colors.  */
+
+struct color_pair
+{
+  int fg;
+  int bg;
+
+  bool operator< (const color_pair &o) const
+  {
+    return fg < o.fg || (fg == o.fg && bg < o.bg);
+  }
+};
+
+/* This maps pairs of colors to their corresponding color pair
+   index.  */
+
+static std::map<color_pair, int> color_pair_map;
+
+/* Given an RGB color, find its index.  */
+
+static bool
+get_color (rgb_color color, int *result)
+{
+  auto it = color_map.find (color);
+  if (it == color_map.end ())
+    {
+      /* The first 8 colors are standard.  */
+      int next = color_map.size () + 8;
+      if (next >= COLORS)
+	return false;
+      /* We store RGB as 0..255, but curses wants 0..1000.  */
+      if (init_color (next, color.r * 1000 / 255, color.g * 1000 / 255,
+		      color.b * 1000 / 255) == ERR)
+	return false;
+      color_map[color] = next;
+      *result = next;
+    }
+  else
+    *result = it->second;
+  return true;
+}
+
+/* This maps bright colors to RGB triples.  The index is the bright
+   color index, starting with bright black.  The values come from
+   xterm.  */
+
+static const rgb_color bright_colors[] = {
+  { 127, 127, 127 },		/* Black.  */
+  { 255, 0, 0 },		/* Red.  */
+  { 0, 255, 0 },		/* Green.  */
+  { 255, 255, 0 },		/* Yellow.  */
+  { 92, 92, 255 },		/* Blue.  */
+  { 255, 0, 255 },		/* Magenta.  */
+  { 0, 255, 255 },		/* Cyan.  */
+  { 255, 255, 255 }		/* White.  */
+};
+
+/* The most recently emitted color pair.  */
+
+static int last_color_pair = -1;
+
+/* Given two colors, return their color pair index; making a new one
+   if necessary.  */
+
+static int
+get_color_pair (int fg, int bg)
+{
+  color_pair c = { fg, bg };
+  auto it = color_pair_map.find (c);
+  if (it == color_pair_map.end ())
+    {
+      /* Color pair 0 is our default color, so new colors start at
+	 1.  */
+      int next = color_pair_map.size () + 1;
+      /* Curses has a limited number of available color pairs.  Fall
+	 back to the default if we've used too many.  */
+      if (next >= COLOR_PAIRS)
+	return 0;
+      init_pair (next, fg, bg);
+      color_pair_map[c] = next;
+      return next;
+    }
+  return it->second;
+}
+
+/* This is indexed by ANSI color offset from the base color, and holds
+   the corresponding curses color constant.  */
+
+static const int curses_colors[] = {
+  COLOR_BLACK,
+  COLOR_RED,
+  COLOR_GREEN,
+  COLOR_YELLOW,
+  COLOR_BLUE,
+  COLOR_MAGENTA,
+  COLOR_CYAN,
+  COLOR_WHITE,
+
+  /* Ignored - RGB colors are not handled through this table.  */
+  -1,
+
+  /* Default color.  */
+  -1
+};
+
+/* This represents a single color while parsing an escape sequence.
+   It can be set either directly from a standard index, or via RGB.
+   Later the color index can be fetched.  */
+
+struct ansi_color
+{
+  void reset ()
+  {
+    is_rgb = false;
+    index = -1;
+  }
+
+  void set (int v)
+  {
+    is_rgb = false;
+    index = v;
+  }
+
+  void set (rgb_color v)
+  {
+    is_rgb = true;
+    rgb = v;
+  }
+
+  bool get (int *result) const
+  {
+    if (is_rgb)
+      {
+	if (!get_color (rgb, result))
+	  return false;
+      }
+    else
+      *result = index;
+    return true;
+  }
+
+private:
+
+  bool is_rgb = false;
+  int index = -1;
+  rgb_color rgb;
+};
+
+/* Read a ";" and a number from STRING, starting at *IDX.  On success,
+   advance *IDX past the number, store it in *NUM, and return true.  */
+
+static bool
+read_semi_number (const char *string, int *idx, long *num)
+{
+  if (string[*idx] != ';')
+    return false;
+  ++*idx;
+  if (string[*idx] < '0' || string[*idx] > '9')
+    return false;
+  char *tail;
+  *num = strtol (string + *idx, &tail, 10);
+  *idx = tail - string;
+  return true;
+}
+
+/* A helper for apply_ansi_escape that reads an extended color
+   sequence; that is, an 8- or 24-bit color.  */
+
+static bool
+extended_color (const char *str, int *idx, ansi_color *color)
+{
+  long value;
+
+  if (!read_semi_number (str, idx, &value))
+    return false;
+
+  if (value == 5)
+    {
+      /* 8-bit color.  */
+      if (!read_semi_number (str, idx, &value))
+	return false;
+
+      if (value >= 0 && value <= 7)
+	color->set (value);
+      else if (value >= 8 && value <= 15)
+	color->set (bright_colors[value - 8]);
+      else if (value >= 16 && value <= 231)
+	{
+	  value -= 16;
+	  short r = (value / 36) * 255 / 6;
+	  value %= 36;
+	  short g = (value / 6) * 255 / 6;
+	  value %= 6;
+	  short b = value * 255 / 6;
+	  rgb_color rgb = { r, g, b };
+	  color->set (rgb);
+	}
+      else if (value >= 232 && value <= 255)
+	{
+	  short v = (value - 232) * 255 / 24;
+	  rgb_color rgb { v, v, v };
+	  color->set (rgb);
+	}
+      else
+	return false;
+    }
+  else if (value == 2)
+    {
+      /* 24-bit color.  */
+      long r, g, b;
+      if (!read_semi_number (str, idx, &r)
+	  || r > 255
+	  || !read_semi_number (str, idx, &g)
+	  || g > 255
+	  || !read_semi_number (str, idx, &b)
+	  || b > 255)
+	return false;
+      rgb_color rgb = { short (r), short (g), short (b) };
+      color->set (rgb);
+    }
+  else
+    {
+      /* Unrecognized sequence.  */
+      return false;
+    }
+
+  return true;
+}
+
+/* A helper for apply_ansi_escape that resets the attributes.  */
+
+static void
+reset_attrs (WINDOW *w)
+{
+  wattron (w, A_NORMAL);
+  wattroff (w, A_BOLD);
+  wattroff (w, A_DIM);
+  wattroff (w, A_REVERSE);
+  if (last_color_pair != -1)
+    wattroff (w, COLOR_PAIR (last_color_pair));
+  wattron (w, COLOR_PAIR (0));
+}
+
+/* Apply an ANSI escape sequence from BUF to W.  BUF must start with
+   the ESC character.  If BUF does not start with an ANSI escape,
+   return 0.  Otherwise, apply the sequence if it is recognized, or
+   simply ignore it if not.  In this case, the number of bytes read
+   from BUF is returned.  */
+
+static size_t
+apply_ansi_escape (WINDOW *w, const char *buf)
+{
+  regmatch_t subexps[NUM_SUBEXPRESSIONS];
+
+  int match = regexec (&ansi_regex, buf, ARRAY_SIZE (subexps), subexps, 0);
+  if (match == REG_NOMATCH)
+    return 0;
+  /* Other failures mean the regexp is broken.  */
+  gdb_assert (match == 0);
+  /* The regexp is anchored.  */
+  gdb_assert (subexps[0].rm_so == 0);
+  /* The final character exists.  */
+  gdb_assert (subexps[FINAL_SUBEXP].rm_eo - subexps[FINAL_SUBEXP].rm_so == 1);
+
+  if (buf[subexps[FINAL_SUBEXP].rm_so] != 'm')
+    {
+      /* We don't handle this sequence, so just drop it.  */
+      return subexps[0].rm_eo;
+    }
+
+  /* Examine each setting in the match and apply it immediately.  See
+     the Select Graphic Rendition section of
+     https://en.wikipedia.org/wiki/ANSI_escape_code.  In essence each
+     code is just a number, separated by ";"; there are some more
+     wrinkles but we don't support them all.  */
+
+  ansi_color fgcolor, bgcolor;
+  bool color_set = false;
+
+  /* "\e[m" means the same thing as "\e[0m", so handle that specially
+     here.  */
+  if (subexps[DATA_SUBEXP].rm_so == subexps[DATA_SUBEXP].rm_eo)
+    reset_attrs (w);
+
+  for (regoff_t i = subexps[DATA_SUBEXP].rm_so;
+       i < subexps[DATA_SUBEXP].rm_eo;
+       ++i)
+    {
+      if (buf[i] == ';')
+	{
+	  /* Skip.  */
+	}
+      else if (buf[i] >= '0' && buf[i] <= '9')
+	{
+	  char *tail;
+	  long value = strtol (buf + i, &tail, 10);
+	  i = tail - buf;
+
+	  switch (value)
+	    {
+	    case 0:
+	      /* Reset.  */
+	      reset_attrs (w);
+	      color_set = false;
+	      fgcolor.reset ();
+	      bgcolor.reset ();
+	      break;
+	    case 1:
+	      /* Bold.  */
+	      wattron (w, A_BOLD);
+	      break;
+	    case 2:
+	      /* Dim.  */
+	      wattron (w, A_DIM);
+	      break;
+	    case 7:
+	      /* Reverse.  */
+	      wattron (w, A_REVERSE);
+	      break;
+	    case 21:
+	      wattroff (w, A_BOLD);
+	      break;
+	    case 22:
+	      /* Normal.  */
+	      wattron (w, A_NORMAL);
+	      break;
+	    case 27:
+	      /* Inverse off.  */
+	      wattroff (w, A_REVERSE);
+	      break;
+
+	    case 30:
+	    case 31:
+	    case 32:
+	    case 33:
+	    case 34:
+	    case 35:
+	    case 36:
+	    case 37:
+	      /* Note: not 38.  */
+	    case 39:
+	      fgcolor.set (curses_colors[value - 30]);
+	      color_set = true;
+	      break;
+
+	    case 40:
+	    case 41:
+	    case 42:
+	    case 43:
+	    case 44:
+	    case 45:
+	    case 46:
+	    case 47:
+	      /* Note: not 48.  */
+	    case 49:
+	      bgcolor.set (curses_colors[value - 40]);
+	      color_set = true;
+	      break;
+
+	    case 90:
+	    case 91:
+	    case 92:
+	    case 93:
+	    case 94:
+	    case 95:
+	    case 96:
+	    case 97:
+	      fgcolor.set (bright_colors[value - 90]);
+	      color_set = true;
+	      break;
+
+	    case 100:
+	    case 101:
+	    case 102:
+	    case 103:
+	    case 104:
+	    case 105:
+	    case 106:
+	    case 107:
+	      bgcolor.set (bright_colors[value - 100]);
+	      color_set = true;
+	      break;
+
+	    case 38:
+	      /* If we can't parse the extended color, fail.  */
+	      if (!extended_color (buf, &i, &fgcolor))
+		return subexps[0].rm_eo;
+	      color_set = true;
+	      break;
+
+	    case 48:
+	      /* If we can't parse the extended color, fail.  */
+	      if (!extended_color (buf, &i, &bgcolor))
+		return subexps[0].rm_eo;
+	      color_set = true;
+	      break;
+
+	    default:
+	      /* Ignore everything else.  */
+	      break;
+	    }
+	}
+      else
+	{
+	  /* Unknown, let's just ignore.  */
+	}
+    }
+
+  if (has_colors () && color_set)
+    {
+      int fgi, bgi;
+      if (fgcolor.get (&fgi) && bgcolor.get (&bgi))
+	{
+	  int pair = get_color_pair (fgi, bgi);
+	  if (last_color_pair != -1)
+	    wattroff (w, COLOR_PAIR (last_color_pair));
+	  wattron (w, COLOR_PAIR (pair));
+	  last_color_pair = pair;
+	}
+    }
+
+  return subexps[0].rm_eo;
+}
+
 /* Print LENGTH characters from the buffer pointed to by BUF to the
    curses command window.  The output is buffered.  It is up to the
    caller to refresh the screen if necessary.  */
@@ -195,10 +677,47 @@  tui_putc (char c)
 void
 tui_write (const char *buf, size_t length)
 {
+  /* We need this to be \0-terminated for the regexp matching.  */
+  std::string copy (buf, length);
+  tui_puts (copy.c_str ());
+}
+
+static void
+tui_puts_internal (const char *string, int *height)
+{
   WINDOW *w = TUI_CMD_WIN->generic.handle;
+  char c;
+  int prev_col = 0;
+
+  while ((c = *string++) != 0)
+    {
+      if (c == '\1' || c == '\2')
+	{
+	  /* Ignore these, they are readline escape-marking
+	     sequences.  */
+	}
+      else
+	{
+	  if (c == '\e')
+	    {
+	      size_t bytes_read = apply_ansi_escape (w, string - 1);
+	      if (bytes_read > 0)
+		{
+		  string = string + bytes_read - 1;
+		  continue;
+		}
+	    }
+	  do_tui_putc (w, c);
 
-  for (size_t i = 0; i < length; i++)
-    do_tui_putc (w, buf[i]);
+	  if (height != nullptr)
+	    {
+	      int col = getcurx (w);
+	      if (col <= prev_col)
+		++*height;
+	      prev_col = col;
+	    }
+	}
+    }
   update_cmdwin_start_line ();
 }
 
@@ -209,12 +728,7 @@  tui_write (const char *buf, size_t length)
 void
 tui_puts (const char *string)
 {
-  WINDOW *w = TUI_CMD_WIN->generic.handle;
-  char c;
-
-  while ((c = *string++) != 0)
-    do_tui_putc (w, c);
-  update_cmdwin_start_line ();
+  tui_puts_internal (string, nullptr);
 }
 
 /* Readline callback.
@@ -254,14 +768,10 @@  tui_redisplay_readline (void)
   wmove (w, start_line, 0);
   prev_col = 0;
   height = 1;
-  for (in = 0; prompt && prompt[in]; in++)
-    {
-      waddch (w, prompt[in]);
-      col = getcurx (w);
-      if (col <= prev_col)
-        height++;
-      prev_col = col;
-    }
+  if (prompt != nullptr)
+    tui_puts_internal (prompt, &height);
+
+  prev_col = getcurx (w);
   for (in = 0; in <= rl_end; in++)
     {
       unsigned char c;
@@ -522,6 +1032,10 @@  tui_setup_io (int mode)
 
       /* Save tty for SIGCONT.  */
       savetty ();
+
+      /* Clean up color information.  */
+      last_color_pair = -1;
+      color_pair_map.clear ();
     }
 }
 
@@ -726,3 +1240,12 @@  tui_expand_tabs (const char *string, int col)
 
   return ret;
 }
+
+void
+_initialize_tui_io ()
+{
+  int code = regcomp (&ansi_regex, ansi_regex_text, REG_EXTENDED);
+  /* If the regular expression was incorrect, it was a programming
+     error.  */
+  gdb_assert (code == 0);
+}
diff --git a/gdb/tui/tui.c b/gdb/tui/tui.c
index 75a9ced6190..d1191cceaed 100644
--- a/gdb/tui/tui.c
+++ b/gdb/tui/tui.c
@@ -437,6 +437,15 @@  tui_enable (void)
 		 gdb_getenv_term ());
 	}
       w = stdscr;
+      if (has_colors ())
+	{
+#ifdef HAVE_USE_DEFAULT_COLORS
+	  /* Ncurses extension to help with resetting to the default
+	     color.  */
+	  use_default_colors ();
+#endif
+	  start_color ();
+	}
 
       /* Check required terminal capabilities.  The MinGW port of
 	 ncurses does have them, but doesn't expose them through "cup".  */