[v18,2/3] locale: charmap: fix off-by-one with ranges

Message ID 946054e4f257b2273103c021f49daf7187919a0a.1690133538.git.nabijaczleweli@nabijaczleweli.xyz
State New
Headers
Series [v18,1/3] iconv: __gconv_btwoc_ascii -> __gconv_btowc_ascii |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Testing passed

Commit Message

Ahelenia Ziemiańska July 23, 2023, 5:33 p.m. UTC
  The "current character" bytes array was incremented at the end of the
loop instead of at the beginning, which meant that for ASCII +
  <UDC80>..<UDCFF>  /x80
it would complain about overrunning 0xFF->0x100 when in reality the loop
would've ended just after.

Instead, bump the current character at the start of the loop
(but not the first time, of course),
precisely as many times as there are characters in the range.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
New patch, trivial and obvious off-by-1.

 locale/programs/charmap.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)
  

Comments

Bruno Haible July 23, 2023, 7:38 p.m. UTC | #1
наб wrote:
> Instead, bump the current character at the start of the loop
> (but not the first time, of course),
> precisely as many times as there are characters in the range.

Looks good to me. Alternatively, one could leave that piece of code
at the end of the loop and add before it these two lines:

       if (cnt == to_nr)
         break;

I hope no opinion regarding which of the two is "better" code style.

Bruno
  

Patch

diff --git a/locale/programs/charmap.c b/locale/programs/charmap.c
index e4847aa3a0..822239ef11 100644
--- a/locale/programs/charmap.c
+++ b/locale/programs/charmap.c
@@ -1037,6 +1037,20 @@  hexadecimal range format should use only capital characters"));
 
   for (cnt = from_nr; cnt <= to_nr; cnt += step)
     {
+      /* Increment the value in the byte sequence.  */
+      if (cnt != from_nr && ++bytes[nbytes - 1] == '\0')
+	{
+	  int b = nbytes - 2;
+	  do
+	    if (b < 0)
+	      {
+		lr_error (lr,
+			  _("resulting bytes for range not representable."));
+		return;
+	      }
+	  while (++bytes[b--] == 0);
+	}
+
       char *name_end;
       obstack_printf (ob, decimal_ellipsis ? "%.*s%0*d" : "%.*s%0*X",
 		      prefix_len, from, len1 - prefix_len, cnt);
@@ -1079,21 +1093,6 @@  hexadecimal range format should use only capital characters"));
       insert_entry (bt, newp->bytes, nbytes, newp);
       /* Please note we don't examine the return value since it is no error
 	 if we have two definitions for a symbol.  */
-
-      /* Increment the value in the byte sequence.  */
-      if (++bytes[nbytes - 1] == '\0')
-	{
-	  int b = nbytes - 2;
-
-	  do
-	    if (b < 0)
-	      {
-		lr_error (lr,
-			  _("resulting bytes for range not representable."));
-		return;
-	      }
-	  while (++bytes[b--] == 0);
-	}
     }
 }