cannot skip augment string handling

Message ID CAP3s5k8EfHWcne4YG=rEThAqhi=7DAkM3VC7hWawkSohnJoAZg@mail.gmail.com
State Committed
Headers
Series cannot skip augment string handling |

Commit Message

Ulrich Drepper Aug. 9, 2022, 6:01 p.m. UTC
  He dwarf_next_cfi function has some clever code which skips over the
processing of the augmentation string content if the first character is 'z'
(for sized augmentation).  This would be OK if it wouldn't be for the fact
that the augment processing loop produces additional information, namely,
it fills in the fde_augmentation_data_size fields.  That information isn't
available elsewhere.

In addition, the loop over the augment string is incorrect because the
interpretation of the P, L, and R entries depends on 'z' being present.  in
the absence of 'z', when the loop would be executed in the current version,
the interpretation of those entries is not the same.

In the patch below I've removed the shortcut and fixed the handling of the
P, L, and R entries.  I've also added an additional test checking that the
entries of the augmentation string don't guide the code to consume more
data then is indicated in the 'z' data.

libdw/ChangeLog
2022-08-09  Ulrich Drepper  <drepper@redhat.com>

* dwarf_next_cfi.c (dwarf_next_cfi): Don't skip processing the
augmentation string.  Be more stringent what to accept.
  

Comments

Mark Wielaard Aug. 13, 2022, 9:28 p.m. UTC | #1
Hi Ulrich,

On Tue, Aug 09, 2022 at 08:01:43PM +0200, Ulrich Drepper via Elfutils-devel wrote:
> He dwarf_next_cfi function has some clever code which skips over the
> processing of the augmentation string content if the first character is 'z'
> (for sized augmentation).  This would be OK if it wouldn't be for the fact
> that the augment processing loop produces additional information, namely,
> it fills in the fde_augmentation_data_size fields.  That information isn't
> available elsewhere.
> 
> In addition, the loop over the augment string is incorrect because the
> interpretation of the P, L, and R entries depends on 'z' being present.  in
> the absence of 'z', when the loop would be executed in the current version,
> the interpretation of those entries is not the same.
> 
> In the patch below I've removed the shortcut and fixed the handling of the
> P, L, and R entries.  I've also added an additional test checking that the
> entries of the augmentation string don't guide the code to consume more
> data then is indicated in the 'z' data.

Looks good. Thanks for catching this. Please do add a Signed-off-by
line next time. See the CONTRIBUTING file.

I was wondering why this hasn't caused an issue before. But it looks
like internally when we use the result of dwarf_next_cfi in cie.c and
fde.c we always call __libdw_intern_cie or intern_new_cie which
recalculates the fde_augmentation_data_size by reading the
augmentation string and data again.

Thanks,

Mark
  

Patch

diff --git a/libdw/dwarf_next_cfi.c b/libdw/dwarf_next_cfi.c
index fa28d99b..23b16885 100644
--- a/libdw/dwarf_next_cfi.c
+++ b/libdw/dwarf_next_cfi.c
@@ -193,50 +193,71 @@  dwarf_next_cfi (const unsigned char e_ident[],
       else			/* DWARF 2 */
 	entry->cie.return_address_register = *bytes++;
 
-      /* If we have sized augmentation data,
-	 we don't need to grok it all.  */
       entry->cie.fde_augmentation_data_size = 0;
+      entry->cie.augmentation_data = bytes;
       bool sized_augmentation = *ap == 'z';
       if (sized_augmentation)
 	{
+	  ++ap;
 	  if (bytes >= limit)
 	    goto invalid;
 	  get_uleb128 (entry->cie.augmentation_data_size, bytes, limit);
 	  if ((Dwarf_Word) (limit - bytes) < entry->cie.augmentation_data_size)
 	    goto invalid;
 	  entry->cie.augmentation_data = bytes;
-	  bytes += entry->cie.augmentation_data_size;
 	}
-      else
-	{
-	  entry->cie.augmentation_data = bytes;
 
-	  for (; *ap != '\0'; ++ap)
+      for (; *ap != '\0'; ++ap)
+	{
+	  uint8_t encoding;
+	  switch (*ap)
 	    {
-	      uint8_t encoding;
-	      switch (*ap)
+	    case 'L':
+	      if (sized_augmentation)
 		{
-		case 'L':		/* Skip LSDA pointer encoding byte.  */
-		case 'R':		/* Skip FDE address encoding byte.  */
+		  /* Skip LSDA pointer encoding byte.  */
 		  encoding = *bytes++;
 		  entry->cie.fde_augmentation_data_size
 		    += encoded_value_size (data, e_ident, encoding, NULL);
 		  continue;
-		case 'P':   /* Skip encoded personality routine pointer. */
+		}
+	      break;
+	    case 'R':
+	      if (sized_augmentation)
+		{
+		  /* Skip FDE address encoding byte.  */
 		  encoding = *bytes++;
-		  bytes += encoded_value_size (data, e_ident, encoding, bytes);
 		  continue;
-		case 'S':		/* Skip signal-frame flag.  */
+		}
+	      break;
+	    case 'P':
+	      if (sized_augmentation)
+		{
+		  /* Skip encoded personality routine pointer. */
+		  encoding = *bytes++;
+		  bytes += encoded_value_size (data, e_ident, encoding, bytes);
 		  continue;
-		default:
-		  /* Unknown augmentation string.  initial_instructions might
-		     actually start with some augmentation data.  */
-		  break;
 		}
 	      break;
+	    case 'S':
+	      if (sized_augmentation)
+		/* Skip signal-frame flag.  */
+		continue;
+	      break;
+	    default:
+	      /* Unknown augmentation string.  initial_instructions might
+		 actually start with some augmentation data.  */
+	      break;
 	    }
-	  entry->cie.augmentation_data_size
-	    = bytes - entry->cie.augmentation_data;
+	  break;
+	}
+      if (! sized_augmentation)
+	entry->cie.augmentation_data_size = bytes - entry->cie.augmentation_data;
+      else
+	{
+	  if (bytes > entry->cie.augmentation_data + entry->cie.augmentation_data_size)
+	    goto invalid;
+	  bytes = entry->cie.augmentation_data + entry->cie.augmentation_data_size;
 	}
 
       entry->cie.initial_instructions = bytes;