[v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]

Message ID 80bb3d3a-bd89-2306-2772-85b5fdcb93c2@kobylkin.com
State New
Commit Message

Egor Kobylkin Dec. 8, 2018, 10:28 p.m. UTC
Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.
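
For readers unfamiliar with the row format mentioned above, here is a
rough Python sketch that reads such rows into a mapping. It is not part
of the patch and not how glibc processes the file; the names
SAMPLE_ROWS, ROW_RE and parse_rows are made up for this illustration,
and it assumes one code point per row and no escaped quotes in the
replacement string.

```python
import re

# Three rows copied from the patch. Fields are tab-separated in the real
# file; this sketch accepts any whitespace between them.
SAMPLE_ROWS = r'''
"\x0401"	"YO"	/* <U0401> CYRILLIC CAPITAL LETTER IO */
"\x0416"	"ZH"	/* <U0416> CYRILLIC CAPITAL LETTER ZHE */
"\x0449"	"shh"	/* <U0449> CYRILLIC SMALL LETTER SHCHA */
'''

# "\xHHHH"  "replacement"  /* comment */
ROW_RE = re.compile(r'^"\\x([0-9a-fA-F]+)"\s+"([^"]*)"\s+/\*.*\*/$')


def parse_rows(text):
    """Return {source character: ASCII replacement} for every matching row."""
    table = {}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue  # blank line or a row shape this sketch does not handle
        table[chr(int(match.group(1), 16))] = match.group(2)
    return table


if __name__ == "__main__":
    print(parse_rows(SAMPLE_ROWS))
```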

Changelog v10:
* Removed ISO 9:1995 / GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics

Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch

Changelog v8:
* Re-added translit_cyrillic, which was missing from patch v7 (due to a
missing "git add" in the script).

Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.

Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".
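
The uppercase rule above can be checked mechanically. The following
Python sketch is only an illustration and not part of the patch: SAMPLE
holds a few pairs copied from the table, and case_mismatches is a
made-up helper name.

```python
# A few upper/lowercase pairs copied from the patch table.
SAMPLE = {
    "Ї": "YI", "ї": "yi",
    "Ж": "ZH", "ж": "zh",
    "Щ": "SHH", "щ": "shh",
    "Ю": "YU", "ю": "yu",
}


def case_mismatches(table):
    """Return the uppercase letters whose replacement differs from the
    uppercased replacement of the matching lowercase letter."""
    bad = []
    for ch, repl in table.items():
        lower = ch.lower()
        if ch != lower and lower in table and repl != table[lower].upper():
            bad.append(ch)
    return bad


if __name__ == "__main__":
    print(case_mismatches(SAMPLE))                  # []
    # The pre-v6 mixed-case form "Yi" would be reported:
    print(case_mismatches({**SAMPLE, "Ї": "Yi"}))   # ['Ї']
```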

Dear locale maintainers,

this patch fixes glibc bug 2872, "Transliteration Cyrillic -> ASCII
fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration rows to locale/C-translit.h.in.

The patch is attached.


Current bug effect:

The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:

iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

That is, it produces a string of question marks and spaces.

This is what it should produce, and does produce once the patch is
applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here.
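
To illustrate the effect of the missing table, here is a minimal Python
model of a per-character lookup with a "?" fallback. It is only a
sketch: the real lookup lives in glibc's C code, and the names to_ascii,
CYRILLIC_TO_ASCII and SAMPLE_TEXT are made up for this example. The rows
are copied from the patch and cover just the first two words of the
test sentence.

```python
# Rows copied from the patch table (only those needed for SAMPLE_TEXT).
CYRILLIC_TO_ASCII = {
    "С": "S", "ъ": "``", "е": "e", "ш": "sh", "ь": "`",
    "щ": "shh", "ё": "yo",
}

SAMPLE_TEXT = "Съешь ещё"


def to_ascii(text, table):
    """Keep ASCII characters, replace others via the table, else emit '?'."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(table.get(ch, "?"))
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii(SAMPLE_TEXT, {}))                 # ????? ???
    print(to_ascii(SAMPLE_TEXT, CYRILLIC_TO_ASCII))  # S``esh` eshhyo
```

With an empty table every Cyrillic letter falls through to "?", which
mirrors the current behaviour shown above.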


COMMIT MESSAGE:
This Cyrillic transliteration table enables conversion (e.g. with iconv)
of UTF-8 encoded text in a Cyrillic alphabet to ASCII via //TRANSLIT.

Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an
ASCII-compatible transcription.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters yet
can still be read by a native speaker. Among other things, it is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, lack the
corresponding fonts, or can't handle UTF-8.

The mapping in this patch is based on the ISO 9:1995 standard [10] and
its derivative GOST 7.79-2000 System B, whose official source is the
Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]. Technically, an independent but mostly identical source
[3] was used, and the table was prepared in a spreadsheet [6].

The conversion of Cyrillic to ASCII according to GOST 7.79-2000 System B
is, strictly speaking, a transcription (preserving phonemes), while
System A is a transliteration (preserving graphemes). There is no
meaningful way to preserve graphemes when converting Cyrillic to ASCII,
so System B was chosen [11]. To be clear, System A has nothing to do
with this bug, even though it is a transliteration.

Those interested in implementing System A for transliteration of
Cyrillic to Latin with diacritics as a new feature are welcome to use
the spreadsheet in [6] as a starting point.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3

Best regards,
Egor Kobylkin
Patch

From b9cd550028ecf7c875c9d7250c8598433b1fc474 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Sat, 8 Dec 2018 22:08:59 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 170 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index e27f39e8fe..bd64edc609 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -2,6 +2,7 @@ 
    Copyright (C) 2000-2018 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper <drepper@redhat.com>, 2000.
+   0401-04f9 contributed by Egor Kobylkin <Egor@Kobylkin.com>, 2018.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -56,6 +57,175 @@ 
 "\x02cd"	"_"	/* <U02CD> MODIFIER LETTER LOW MACRON */
 "\x02d0"	":"	/* <U02D0> MODIFIER LETTER TRIANGULAR COLON */
 "\x02dc"	"~"	/* <U02DC> SMALL TILDE */
+"\x0401"	"YO"	/* <U0401> CYRILLIC CAPITAL LETTER IO */
+"\x0402"	"DJ"	/* <U0402> CYRILLIC CAPITAL LETTER DJE */
+"\x0403"	"G`"	/* <U0403> CYRILLIC CAPITAL LETTER GJE */
+"\x0404"	"YE"	/* <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE */
+"\x0405"	"Z`"	/* <U0405> CYRILLIC CAPITAL LETTER DZE */
+"\x0406"	"I"	/* <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0407"	"YI"	/* <U0407> CYRILLIC CAPITAL LETTER YI */
+"\x0408"	"J"	/* <U0408> CYRILLIC CAPITAL LETTER JE */
+"\x0409"	"L`"	/* <U0409> CYRILLIC CAPITAL LETTER LJE */
+"\x040a"	"N`"	/* <U040A> CYRILLIC CAPITAL LETTER NJE */
+"\x040b"	"TSH"	/* <U040B> CYRILLIC CAPITAL LETTER TSHE */
+"\x040c"	"K`"	/* <U040C> CYRILLIC CAPITAL LETTER KJE */
+"\x040e"	"U`"	/* <U040E> CYRILLIC CAPITAL LETTER SHORT U */
+"\x040f"	"DH"	/* <U040F> CYRILLIC CAPITAL LETTER DZHE */
+"\x0410"	"A"	/* <U0410> CYRILLIC CAPITAL LETTER A */
+"\x0411"	"B"	/* <U0411> CYRILLIC CAPITAL LETTER BE */
+"\x0412"	"V"	/* <U0412> CYRILLIC CAPITAL LETTER VE */
+"\x0413"	"G"	/* <U0413> CYRILLIC CAPITAL LETTER GHE */
+"\x0414"	"D"	/* <U0414> CYRILLIC CAPITAL LETTER DE */
+"\x0415"	"E"	/* <U0415> CYRILLIC CAPITAL LETTER IE */
+"\x0416"	"ZH"	/* <U0416> CYRILLIC CAPITAL LETTER ZHE */
+"\x0417"	"Z"	/* <U0417> CYRILLIC CAPITAL LETTER ZE */
+"\x0418"	"I"	/* <U0418> CYRILLIC CAPITAL LETTER I */
+"\x0419"	"J"	/* <U0419> CYRILLIC CAPITAL LETTER SHORT I */
+"\x041a"	"K"	/* <U041A> CYRILLIC CAPITAL LETTER KA */
+"\x041b"	"L"	/* <U041B> CYRILLIC CAPITAL LETTER EL */
+"\x041c"	"M"	/* <U041C> CYRILLIC CAPITAL LETTER EM */
+"\x041d"	"N"	/* <U041D> CYRILLIC CAPITAL LETTER EN */
+"\x041e"	"O"	/* <U041E> CYRILLIC CAPITAL LETTER O */
+"\x041f"	"P"	/* <U041F> CYRILLIC CAPITAL LETTER PE */
+"\x0420"	"R"	/* <U0420> CYRILLIC CAPITAL LETTER ER */
+"\x0421"	"S"	/* <U0421> CYRILLIC CAPITAL LETTER ES */
+"\x0422"	"T"	/* <U0422> CYRILLIC CAPITAL LETTER TE */
+"\x0423"	"U"	/* <U0423> CYRILLIC CAPITAL LETTER U */
+"\x0424"	"F"	/* <U0424> CYRILLIC CAPITAL LETTER EF */
+"\x0425"	"X"	/* <U0425> CYRILLIC CAPITAL LETTER HA */
+"\x0426"	"CZ"	/* <U0426> CYRILLIC CAPITAL LETTER TSE */
+"\x0427"	"CH"	/* <U0427> CYRILLIC CAPITAL LETTER CHE */
+"\x0428"	"SH"	/* <U0428> CYRILLIC CAPITAL LETTER SHA */
+"\x0429"	"SHH"	/* <U0429> CYRILLIC CAPITAL LETTER SHCHA */
+"\x042a"	"``"	/* <U042A> CYRILLIC CAPITAL LETTER HARD SIGN */
+"\x042b"	"Y`"	/* <U042B> CYRILLIC CAPITAL LETTER YERU */
+"\x042c"	"`"	/* <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN */
+"\x042d"	"E`"	/* <U042D> CYRILLIC CAPITAL LETTER E */
+"\x042e"	"YU"	/* <U042E> CYRILLIC CAPITAL LETTER YU */
+"\x042f"	"YA"	/* <U042F> CYRILLIC CAPITAL LETTER YA */
+"\x0430"	"a"	/* <U0430> CYRILLIC SMALL LETTER A */
+"\x0431"	"b"	/* <U0431> CYRILLIC SMALL LETTER BE */
+"\x0432"	"v"	/* <U0432> CYRILLIC SMALL LETTER VE */
+"\x0433"	"g"	/* <U0433> CYRILLIC SMALL LETTER GHE */
+"\x0434"	"d"	/* <U0434> CYRILLIC SMALL LETTER DE */
+"\x0435"	"e"	/* <U0435> CYRILLIC SMALL LETTER IE */
+"\x0436"	"zh"	/* <U0436> CYRILLIC SMALL LETTER ZHE */
+"\x0437"	"z"	/* <U0437> CYRILLIC SMALL LETTER ZE */
+"\x0438"	"i"	/* <U0438> CYRILLIC SMALL LETTER I */
+"\x0439"	"j"	/* <U0439> CYRILLIC SMALL LETTER SHORT I */
+"\x043a"	"k"	/* <U043A> CYRILLIC SMALL LETTER KA */
+"\x043b"	"l"	/* <U043B> CYRILLIC SMALL LETTER EL */
+"\x043c"	"m"	/* <U043C> CYRILLIC SMALL LETTER EM */
+"\x043d"	"n"	/* <U043D> CYRILLIC SMALL LETTER EN */
+"\x043e"	"o"	/* <U043E> CYRILLIC SMALL LETTER O */
+"\x043f"	"p"	/* <U043F> CYRILLIC SMALL LETTER PE */
+"\x0440"	"r"	/* <U0440> CYRILLIC SMALL LETTER ER */
+"\x0441"	"s"	/* <U0441> CYRILLIC SMALL LETTER ES */
+"\x0442"	"t"	/* <U0442> CYRILLIC SMALL LETTER TE */
+"\x0443"	"u"	/* <U0443> CYRILLIC SMALL LETTER U */
+"\x0444"	"f"	/* <U0444> CYRILLIC SMALL LETTER EF */
+"\x0445"	"x"	/* <U0445> CYRILLIC SMALL LETTER HA */
+"\x0446"	"cz"	/* <U0446> CYRILLIC SMALL LETTER TSE */
+"\x0447"	"ch"	/* <U0447> CYRILLIC SMALL LETTER CHE */
+"\x0448"	"sh"	/* <U0448> CYRILLIC SMALL LETTER SHA */
+"\x0449"	"shh"	/* <U0449> CYRILLIC SMALL LETTER SHCHA */
+"\x044a"	"``"	/* <U044A> CYRILLIC SMALL LETTER HARD SIGN */
+"\x044b"	"y`"	/* <U044B> CYRILLIC SMALL LETTER YERU */
+"\x044c"	"`"	/* <U044C> CYRILLIC SMALL LETTER SOFT SIGN */
+"\x044d"	"e`"	/* <U044D> CYRILLIC SMALL LETTER E */
+"\x044e"	"yu"	/* <U044E> CYRILLIC SMALL LETTER YU */
+"\x044f"	"ya"	/* <U044F> CYRILLIC SMALL LETTER YA */
+"\x0451"	"yo"	/* <U0451> CYRILLIC SMALL LETTER IO */
+"\x0452"	"dj"	/* <U0452> CYRILLIC SMALL LETTER DJE */
+"\x0453"	"g`"	/* <U0453> CYRILLIC SMALL LETTER GJE */
+"\x0454"	"ye"	/* <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE */
+"\x0455"	"z`"	/* <U0455> CYRILLIC SMALL LETTER DZE */
+"\x0456"	"i"	/* <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0457"	"yi"	/* <U0457> CYRILLIC SMALL LETTER YI */
+"\x0458"	"j"	/* <U0458> CYRILLIC SMALL LETTER JE */
+"\x0459"	"l`"	/* <U0459> CYRILLIC SMALL LETTER LJE */
+"\x045a"	"n`"	/* <U045A> CYRILLIC SMALL LETTER NJE */
+"\x045b"	"tsh"	/* <U045B> CYRILLIC SMALL LETTER TSHE */
+"\x045c"	"k`"	/* <U045C> CYRILLIC SMALL LETTER KJE */
+"\x045e"	"u`"	/* <U045E> CYRILLIC SMALL LETTER SHORT U */
+"\x045f"	"dh"	/* <U045F> CYRILLIC SMALL LETTER DZHE */
+"\x046a"	"O`"	/* <U046A> CYRILLIC CAPITAL LETTER BIG YUS */
+"\x046b"	"o`"	/* <U046B> CYRILLIC SMALL LETTER BIG YUS */
+"\x0472"	"FH"	/* <U0472> CYRILLIC CAPITAL LETTER FITA */
+"\x0473"	"fh"	/* <U0473> CYRILLIC SMALL LETTER FITA */
+"\x0474"	"YH"	/* <U0474> CYRILLIC CAPITAL LETTER IZHITSA */
+"\x0475"	"yh"	/* <U0475> CYRILLIC SMALL LETTER IZHITSA */
+"\x048c"	"E`"	/* <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN */
+"\x048d"	"e`"	/* <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN */
+"\x0490"	"G`"	/* <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
+"\x0491"	"g`"	/* <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN */
+"\x0492"	"GH"	/* <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE */
+"\x0493"	"gh"	/* <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE */
+"\x0494"	"GH"	/* <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK */
+"\x0495"	"gh"	/* <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK */
+"\x0496"	"ZH`"	/* <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER */
+"\x0497"	"zh`"	/* <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER */
+"\x049a"	"K`"	/* <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER */
+"\x049b"	"k`"	/* <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER */
+"\x049e"	"K`"	/* <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE */
+"\x049f"	"k`"	/* <U049F> CYRILLIC SMALL LETTER KA WITH STROKE */
+"\x04a2"	"N`"	/* <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER */
+"\x04a3"	"n`"	/* <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER */
+"\x04a4"	"NG"	/* <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE */
+"\x04a5"	"ng"	/* <U04A5> CYRILLIC SMALL LIGATURE EN GHE */
+"\x04a6"	"P`"	/* <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK */
+"\x04a7"	"p`"	/* <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK */
+"\x04a8"	"O`"	/* <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA */
+"\x04a9"	"o`"	/* <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA */
+"\x04aa"	"C`"	/* <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER */
+"\x04ab"	"c`"	/* <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER */
+"\x04ac"	"T`"	/* <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER */
+"\x04ad"	"t`"	/* <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER */
+"\x04ae"	"U"	/* <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U */
+"\x04af"	"u"	/* <U04AF> CYRILLIC SMALL LETTER STRAIGHT U */
+"\x04b2"	"H`"	/* <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER */
+"\x04b3"	"h`"	/* <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER */
+"\x04b4"	"TCZ"	/* <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE */
+"\x04b5"	"tcz"	/* <U04B5> CYRILLIC SMALL LIGATURE TE TSE */
+"\x04ba"	"SH`"	/* <U04BA> CYRILLIC CAPITAL LETTER SHHA */
+"\x04bb"	"sh`"	/* <U04BB> CYRILLIC SMALL LETTER SHHA */
+"\x04bc"	"CH`"	/* <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE */
+"\x04bd"	"ch`"	/* <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE */
+"\x04be"	"CH`"	/* <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04bf"	"ch`"	/* <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04c0"	"i"	/* <U04C0> CYRILLIC LETTER PALOCHKA */
+"\x04c1"	"ZH`"	/* <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE */
+"\x04c2"	"zh`"	/* <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE */
+"\x04cb"	"CH`"	/* <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE */
+"\x04cc"	"ch`"	/* <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE */
+"\x04d0"	"A`"	/* <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE */
+"\x04d1"	"a`"	/* <U04D1> CYRILLIC SMALL LETTER A WITH BREVE */
+"\x04d2"	"A`"	/* <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS */
+"\x04d3"	"a`"	/* <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS */
+"\x04d6"	"E`"	/* <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE */
+"\x04d7"	"e`"	/* <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE */
+"\x04d8"	"A`"	/* <U04D8> CYRILLIC CAPITAL LETTER SCHWA */
+"\x04d9"	"a`"	/* <U04D9> CYRILLIC SMALL LETTER SCHWA */
+"\x04dc"	"ZH`"	/* <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS */
+"\x04dd"	"zh`"	/* <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS */
+"\x04de"	"Z`"	/* <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS */
+"\x04df"	"z`"	/* <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS */
+"\x04e0"	"Z`"	/* <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE */
+"\x04e1"	"z`"	/* <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE */
+"\x04e4"	"I`"	/* <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS */
+"\x04e5"	"i`"	/* <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS */
+"\x04e6"	"O`"	/* <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS */
+"\x04e7"	"o`"	/* <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS */
+"\x04e8"	"O`"	/* <U04E8> CYRILLIC CAPITAL LETTER BARRED O */
+"\x04e9"	"o`"	/* <U04E9> CYRILLIC SMALL LETTER BARRED O */
+"\x04f0"	"U`"	/* <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS */
+"\x04f1"	"u`"	/* <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS */
+"\x04f2"	"U`"	/* <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE */
+"\x04f3"	"u`"	/* <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE */
+"\x04f4"	"CH`"	/* <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS */
+"\x04f5"	"ch`"	/* <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS */
+"\x04f8"	"Y`"	/* <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS */
+"\x04f9"	"y`"	/* <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS */
 "\x2002"	" "	/* <U2002> EN SPACE */
 "\x2003"	" "	/* <U2003> EM SPACE */
 "\x2004"	" "	/* <U2004> THREE-PER-EM SPACE */
-- 
2.17.1