From patchwork Wed Dec 19 23:16:37 2018
X-Patchwork-Submitter: Egor Kobylkin
X-Patchwork-Id: 30766
Subject: Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
From: Egor Kobylkin
To: libc-alpha@sourceware.org, libc-locales@sourceware.org,
 Marko Myllynen, Carlos O'Donell
Message-ID: <01570554-64e4-8159-eeb3-1aff9f76b1f4@kobylkin.com>
In-Reply-To: <80bb3d3a-bd89-2306-2772-85b5fdcb93c2@kobylkin.com>
Date: Thu, 20 Dec 2018 00:16:37 +0100

Freeze ping.

I'd like to ping the list on this patch and to have some discussion on
moving the ASCII transliteration to locale/C-translit.h.in before the
freeze.

The wiki page for 2.29 [12] is set as "immutable" for newly registered
users; I am not sure that is intended. I could not add this patch there
as "desired". I have added the 2.29 keyword to the bug entry.

Bests,
Egor Kobylkin

[12] https://sourceware.org/glibc/wiki/Release/2.29

On 08.12.18 23:28, Egor Kobylkin wrote:
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
>   file for the ASCII translit table.
> * Correspondingly, the patch now only contains the additional
>   Cyrillic-ASCII strings in the format of the locale/C-translit.h.in
>   table. The 'include "translit_cyrillic";""' directives are not
>   necessary in the locale files, which are now all left intact.
> * The file translit_cyrillic is no longer needed and is omitted.
> * Edited the email below and the commit message.
>
> Changelog v10:
> * Removed ISO 9:1995 / GOST 7.79-2000 System A (transliteration to
>   Latin with diacritics) as conflicting with System B within glibc
>   mechanics and not solving BZ #2872.
> * Edited the email below, the commit message, and the comment in
>   translit_cyrillic to reflect the System A removal.
> * Removed the Cyrillic U with acute entries (using composition), as
>   composing is not covered by the current glibc conversion mechanics.
>
> Changelog v9:
> * Fixed formatting (trailing spaces etc.).
> * Put the commit summary in the patch file; it is now generated
>   completely by git format-patch.
>
> Changelog v8:
> * Re-added translit_cyrillic, which was missing in patch v7 (due to a
>   missing "git add" in the script).
>
> Changelog v7:
> * Generated against git://sourceware.org/git/glibc.git master with
>   git format-patch.
> * The 'include "translit_cyrillic";""' now immediately follows the
>   last 'include "translit_XXX";""' string (it was previously inserted
>   just before translit_end).
> * Only the locales already having 'include .*translit.*;""' are
>   patched (see the list of manual exclusions below; the full list of
>   included locales is at the end of the email in the commit section).
> * Excluded az_AZ completely to avoid a circular reference from tr_TR
>   via “copy "tr_TR"”.
>
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
> * Consistently transliterate single uppercase Cyrillic letters to
>   sequences of all-uppercase Latin letters in all languages (whenever
>   a Cyrillic letter is transliterated to more than one Latin letter);
>   for example, "Ї" is now transliterated as "YI" rather than "Yi".
>
> Dear locale maintainers,
>
> this patch fixes glibc bug 2872, "Transliteration Cyrillic -> ASCII
> fails":
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>
> It adds the Cyrillic transliteration rows to locale/C-translit.h.in.
>
> The patch is attached.
>
>
> Current bug effect:
>
> The glibc wiki explicitly lists this use case as the test example, and
> it currently fails on Cyrillic text [1] [8] [9]:
>
> iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
>
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>
> That is, it produces a string of question marks and spaces.
>
> This is what it should produce, and does produce once the patch is
> applied:
>
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
>
>
> The root problem and the fix:
>
> The root problem is the missing transliteration table, which I am
> supplying here.
>
>
> COMMIT MESSAGE:
> This Cyrillic transliteration table enables conversion (e.g. with
> iconv) of UTF-8 encoded text written in a Cyrillic alphabet to
> ASCII//TRANSLIT text.
>
> Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an
> ASCII-compatible transcription.
>
> While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result
> of a transliteration/transcription contains only Latin/ASCII codes yet
> can still be read by a native speaker. Among other things, it is
> useful for processing Cyrillic texts and filenames with programs or on
> systems that are not specifically prepared to work with Cyrillic, do
> not have the corresponding fonts installed, or cannot handle UTF-8.
>
> The patch content (mapping) is based on the ISO 9:1995 standard [10]
> and its derivative GOST 7.79-2000 System B, official source: Federal
> Agency on Technical Regulating and Metrology of the Russian Federation
> [2]. Technically, an independent but mostly identical source [3] was
> used and prepared in a spreadsheet [6].
>
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B is what is properly called transcription (preserving
> phonemes), while System A is transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes when converting
> Cyrillic to ASCII, and thus System B was chosen [11]. To be clear,
> System A has nothing to do with this bug, even though it is a
> transliteration.
>
> Those interested in implementing System A for transliteration of
> Cyrillic to Latin with diacritics as a new feature are welcome to use
> the spreadsheet in [6] as a starting point.
>
> Links:
>
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (only
> available in low-quality GIF format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=11304
> [10]
> https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
> [11]
> https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
>
> Best regards,
> Egor Kobylkin
>
>

From b9cd550028ecf7c875c9d7250c8598433b1fc474 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin
Date: Sat, 8 Dec 2018 22:08:59 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

	[BZ #2872]
	* locale/C-translit.h.in: Add Cyrillic transliteration.
---
 locale/C-translit.h.in | 170 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index e27f39e8fe..bd64edc609 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -2,6 +2,7 @@
    Copyright (C) 2000-2018 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper , 2000.
+   0401-04f9 contributed by Egor Kobylkin , 2018.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -56,6 +57,175 @@
 "\x02cd" "_" /* MODIFIER LETTER LOW MACRON */
 "\x02d0" ":" /* MODIFIER LETTER TRIANGULAR COLON */
 "\x02dc" "~" /* SMALL TILDE */
+"\x0401" "YO" /* CYRILLIC CAPITAL LETTER IO */
+"\x0402" "DJ" /* CYRILLIC CAPITAL LETTER DJE */
+"\x0403" "G`" /* CYRILLIC CAPITAL LETTER GJE */
+"\x0404" "YE" /* CYRILLIC CAPITAL LETTER UKRAINIAN IE */
+"\x0405" "Z`" /* CYRILLIC CAPITAL LETTER DZE */
+"\x0406" "I" /* CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0407" "YI" /* CYRILLIC CAPITAL LETTER YI */
+"\x0408" "J" /* CYRILLIC CAPITAL LETTER JE */
+"\x0409" "L`" /* CYRILLIC CAPITAL LETTER LJE */
+"\x040a" "N`" /* CYRILLIC CAPITAL LETTER NJE */
+"\x040b" "TSH" /* CYRILLIC CAPITAL LETTER TSHE */
+"\x040c" "K`" /* CYRILLIC CAPITAL LETTER KJE */
+"\x040e" "U`" /* CYRILLIC CAPITAL LETTER SHORT U */
+"\x040f" "DH" /* CYRILLIC CAPITAL LETTER DZHE */
+"\x0410" "A" /* CYRILLIC CAPITAL LETTER A */
+"\x0411" "B" /* CYRILLIC CAPITAL LETTER BE */
+"\x0412" "V" /* CYRILLIC CAPITAL LETTER VE */
+"\x0413" "G" /* CYRILLIC CAPITAL LETTER GHE */
+"\x0414" "D" /* CYRILLIC CAPITAL LETTER DE */
+"\x0415" "E" /* CYRILLIC CAPITAL LETTER IE */
+"\x0416" "ZH" /* CYRILLIC CAPITAL LETTER ZHE */
+"\x0417" "Z" /* CYRILLIC CAPITAL LETTER ZE */
+"\x0418" "I" /* CYRILLIC CAPITAL LETTER I */
+"\x0419" "J" /* CYRILLIC CAPITAL LETTER SHORT I */
+"\x041a" "K" /* CYRILLIC CAPITAL LETTER KA */
+"\x041b" "L" /* CYRILLIC CAPITAL LETTER EL */
+"\x041c" "M" /* CYRILLIC CAPITAL LETTER EM */
+"\x041d" "N" /* CYRILLIC CAPITAL LETTER EN */
+"\x041e" "O" /* CYRILLIC CAPITAL LETTER O */
+"\x041f" "P" /* CYRILLIC CAPITAL LETTER PE */
+"\x0420" "R" /* CYRILLIC CAPITAL LETTER ER */
+"\x0421" "S" /* CYRILLIC CAPITAL LETTER ES */
+"\x0422" "T" /* CYRILLIC CAPITAL LETTER TE */
+"\x0423" "U" /* CYRILLIC CAPITAL LETTER U */
+"\x0424" "F" /* CYRILLIC CAPITAL LETTER EF */
+"\x0425" "X" /* CYRILLIC CAPITAL LETTER HA */
+"\x0426" "CZ" /* CYRILLIC CAPITAL LETTER TSE */
+"\x0427" "CH" /* CYRILLIC CAPITAL LETTER CHE */
+"\x0428" "SH" /* CYRILLIC CAPITAL LETTER SHA */
+"\x0429" "SHH" /* CYRILLIC CAPITAL LETTER SHCHA */
+"\x042a" "A`" /* CYRILLIC CAPITAL LETTER HARD SIGN */
+"\x042b" "Y`" /* CYRILLIC CAPITAL LETTER YERU */
+"\x042c" "`" /* CYRILLIC CAPITAL LETTER SOFT SIGN */
+"\x042d" "E`" /* CYRILLIC CAPITAL LETTER E */
+"\x042e" "YU" /* CYRILLIC CAPITAL LETTER YU */
+"\x042f" "YA" /* CYRILLIC CAPITAL LETTER YA */
+"\x0430" "a" /* CYRILLIC SMALL LETTER A */
+"\x0431" "b" /* CYRILLIC SMALL LETTER BE */
+"\x0432" "v" /* CYRILLIC SMALL LETTER VE */
+"\x0433" "g" /* CYRILLIC SMALL LETTER GHE */
+"\x0434" "d" /* CYRILLIC SMALL LETTER DE */
+"\x0435" "e" /* CYRILLIC SMALL LETTER IE */
+"\x0436" "zh" /* CYRILLIC SMALL LETTER ZHE */
+"\x0437" "z" /* CYRILLIC SMALL LETTER ZE */
+"\x0438" "i" /* CYRILLIC SMALL LETTER I */
+"\x0439" "j" /* CYRILLIC SMALL LETTER SHORT I */
+"\x043a" "k" /* CYRILLIC SMALL LETTER KA */
+"\x043b" "l" /* CYRILLIC SMALL LETTER EL */
+"\x043c" "m" /* CYRILLIC SMALL LETTER EM */
+"\x043d" "n" /* CYRILLIC SMALL LETTER EN */
+"\x043e" "o" /* CYRILLIC SMALL LETTER O */
+"\x043f" "p" /* CYRILLIC SMALL LETTER PE */
+"\x0440" "r" /* CYRILLIC SMALL LETTER ER */
+"\x0441" "s" /* CYRILLIC SMALL LETTER ES */
+"\x0442" "t" /* CYRILLIC SMALL LETTER TE */
+"\x0443" "u" /* CYRILLIC SMALL LETTER U */
+"\x0444" "f" /* CYRILLIC SMALL LETTER EF */
+"\x0445" "x" /* CYRILLIC SMALL LETTER HA */
+"\x0446" "cz" /* CYRILLIC SMALL LETTER TSE */
+"\x0447" "ch" /* CYRILLIC SMALL LETTER CHE */
+"\x0448" "sh" /* CYRILLIC SMALL LETTER SHA */
+"\x0449" "shh" /* CYRILLIC SMALL LETTER SHCHA */
+"\x044a" "``" /* CYRILLIC SMALL LETTER HARD SIGN */
+"\x044b" "y`" /* CYRILLIC SMALL LETTER YERU */
+"\x044c" "`" /* CYRILLIC SMALL LETTER SOFT SIGN */
+"\x044d" "e`" /* CYRILLIC SMALL LETTER E */
+"\x044e" "yu" /* CYRILLIC SMALL LETTER YU */
+"\x044f" "ya" /* CYRILLIC SMALL LETTER YA */
+"\x0451" "yo" /* CYRILLIC SMALL LETTER IO */
+"\x0452" "dj" /* CYRILLIC SMALL LETTER DJE */
+"\x0453" "g`" /* CYRILLIC SMALL LETTER GJE */
+"\x0454" "ye" /* CYRILLIC SMALL LETTER UKRAINIAN IE */
+"\x0455" "z`" /* CYRILLIC SMALL LETTER DZE */
+"\x0456" "i" /* CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0457" "yi" /* CYRILLIC SMALL LETTER YI */
+"\x0458" "j" /* CYRILLIC SMALL LETTER JE */
+"\x0459" "l`" /* CYRILLIC SMALL LETTER LJE */
+"\x045a" "n`" /* CYRILLIC SMALL LETTER NJE */
+"\x045b" "tsh" /* CYRILLIC SMALL LETTER TSHE */
+"\x045c" "k`" /* CYRILLIC SMALL LETTER KJE */
+"\x045e" "u`" /* CYRILLIC SMALL LETTER SHORT U */
+"\x045f" "dh" /* CYRILLIC SMALL LETTER DZHE */
+"\x046a" "O`" /* CYRILLIC CAPITAL LETTER BIG YUS */
+"\x046b" "o`" /* CYRILLIC SMALL LETTER BIG YUS */
+"\x0472" "FH" /* CYRILLIC CAPITAL LETTER FITA */
+"\x0473" "fh" /* CYRILLIC SMALL LETTER FITA */
+"\x0474" "YH" /* CYRILLIC CAPITAL LETTER IZHITSA */
+"\x0475" "yh" /* CYRILLIC SMALL LETTER IZHITSA */
+"\x048c" "E`" /* CYRILLIC CAPITAL LETTER SEMISOFT SIGN */
+"\x048d" "e`" /* CYRILLIC SMALL LETTER SEMISOFT SIGN */
+"\x0490" "G`" /* CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
+"\x0491" "g`" /* CYRILLIC SMALL LETTER GHE WITH UPTURN */
+"\x0492" "GH" /* CYRILLIC CAPITAL LETTER GHE WITH STROKE */
+"\x0493" "gh" /* CYRILLIC SMALL LETTER GHE WITH STROKE */
+"\x0494" "GH" /* CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK */
+"\x0495" "gh" /* CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK */
+"\x0496" "ZH`" /* CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER */
+"\x0497" "zh`" /* CYRILLIC SMALL LETTER ZHE WITH DESCENDER */
+"\x049a" "K`" /* CYRILLIC CAPITAL LETTER KA WITH DESCENDER */
+"\x049b" "k`" /* CYRILLIC SMALL LETTER KA WITH DESCENDER */
+"\x049e" "K`" /* CYRILLIC CAPITAL LETTER KA WITH STROKE */
+"\x049f" "k`" /* CYRILLIC SMALL LETTER KA WITH STROKE */
+"\x04a2" "N`" /* CYRILLIC CAPITAL LETTER EN WITH DESCENDER */
+"\x04a3" "n`" /* CYRILLIC SMALL LETTER EN WITH DESCENDER */
+"\x04a4" "NG" /* CYRILLIC CAPITAL LIGATURE EN GHE */
+"\x04a5" "ng" /* CYRILLIC SMALL LIGATURE EN GHE */
+"\x04a6" "P`" /* CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK */
+"\x04a7" "p`" /* CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK */
+"\x04a8" "O`" /* CYRILLIC CAPITAL LETTER ABKHASIAN HA */
+"\x04a9" "o`" /* CYRILLIC SMALL LETTER ABKHASIAN HA */
+"\x04aa" "C`" /* CYRILLIC CAPITAL LETTER ES WITH DESCENDER */
+"\x04ab" "c`" /* CYRILLIC SMALL LETTER ES WITH DESCENDER */
+"\x04ac" "T`" /* CYRILLIC CAPITAL LETTER TE WITH DESCENDER */
+"\x04ad" "t`" /* CYRILLIC SMALL LETTER TE WITH DESCENDER */
+"\x04ae" "U" /* CYRILLIC CAPITAL LETTER STRAIGHT U */
+"\x04af" "u" /* CYRILLIC SMALL LETTER STRAIGHT U */
+"\x04b2" "H`" /* CYRILLIC CAPITAL LETTER HA WITH DESCENDER */
+"\x04b3" "h`" /* CYRILLIC SMALL LETTER HA WITH DESCENDER */
+"\x04b4" "TCZ" /* CYRILLIC CAPITAL LIGATURE TE TSE */
+"\x04b5" "tcz" /* CYRILLIC SMALL LIGATURE TE TSE */
+"\x04ba" "SH`" /* CYRILLIC CAPITAL LETTER SHHA */
+"\x04bb" "sh`" /* CYRILLIC SMALL LETTER SHHA */
+"\x04bc" "CH`" /* CYRILLIC CAPITAL LETTER ABKHASIAN CHE */
+"\x04bd" "ch`" /* CYRILLIC SMALL LETTER ABKHASIAN CHE */
+"\x04be" "CH`" /* CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04bf" "ch`" /* CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04c0" "i" /* CYRILLIC LETTER PALOCHKA */
+"\x04c1" "ZH`" /* CYRILLIC CAPITAL LETTER ZHE WITH BREVE */
+"\x04c2" "zh`" /* CYRILLIC SMALL LETTER ZHE WITH BREVE */
+"\x04cb" "CH`" /* CYRILLIC CAPITAL LETTER KHAKASSIAN CHE */
+"\x04cc" "ch`" /* CYRILLIC SMALL LETTER KHAKASSIAN CHE */
+"\x04d0" "A`" /* CYRILLIC CAPITAL LETTER A WITH BREVE */
+"\x04d1" "a`" /* CYRILLIC SMALL LETTER A WITH BREVE */
+"\x04d2" "A`" /* CYRILLIC CAPITAL LETTER A WITH DIAERESIS */
+"\x04d3" "a`" /* CYRILLIC SMALL LETTER A WITH DIAERESIS */
+"\x04d6" "E`" /* CYRILLIC CAPITAL LETTER IE WITH BREVE */
+"\x04d7" "e`" /* CYRILLIC SMALL LETTER IE WITH BREVE */
+"\x04d8" "A`" /* CYRILLIC CAPITAL LETTER SCHWA */
+"\x04d9" "a`" /* CYRILLIC SMALL LETTER SCHWA */
+"\x04dc" "ZH`" /* CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS */
+"\x04dd" "zh`" /* CYRILLIC SMALL LETTER ZHE WITH DIAERESIS */
+"\x04de" "Z`" /* CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS */
+"\x04df" "z`" /* CYRILLIC SMALL LETTER ZE WITH DIAERESIS */
+"\x04e0" "Z`" /* CYRILLIC CAPITAL LETTER ABKHASIAN DZE */
+"\x04e1" "z`" /* CYRILLIC SMALL LETTER ABKHASIAN DZE */
+"\x04e4" "I`" /* CYRILLIC CAPITAL LETTER I WITH DIAERESIS */
+"\x04e5" "i`" /* CYRILLIC SMALL LETTER I WITH DIAERESIS */
+"\x04e6" "O`" /* CYRILLIC CAPITAL LETTER O WITH DIAERESIS */
+"\x04e7" "o`" /* CYRILLIC SMALL LETTER O WITH DIAERESIS */
+"\x04e8" "O`" /* CYRILLIC CAPITAL LETTER BARRED O */
+"\x04e9" "o`" /* CYRILLIC SMALL LETTER BARRED O */
+"\x04f0" "U`" /* CYRILLIC CAPITAL LETTER U WITH DIAERESIS */
+"\x04f1" "u`" /* CYRILLIC SMALL LETTER U WITH DIAERESIS */
+"\x04f2" "U`" /* CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE */
+"\x04f3" "u`" /* CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE */
+"\x04f4" "CH`" /* CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS */
+"\x04f5" "ch`" /* CYRILLIC SMALL LETTER CHE WITH DIAERESIS */
+"\x04f8" "Y`" /* CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS */
+"\x04f9" "y`" /* CYRILLIC SMALL LETTER YERU WITH DIAERESIS */
 "\x2002" " " /* EN SPACE */
 "\x2003" " " /* EM SPACE */
 "\x2004" " " /* THREE-PER-EM SPACE */
--
2.17.1
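
For readers who want to see how the added rows are meant to behave, here
is a minimal Python sketch that applies a small subset of the table in
the diff to the first clause of the test sentence. It is not glibc's
implementation: the names TRANSLIT_SUBSET and translit_ascii are
hypothetical, only the rows needed for this sample are copied from the
diff, and the "?" fallback merely imitates the question marks that iconv
prints for characters without a transliteration rule.

```python
# Hypothetical subset of the Cyrillic rows from the patch, keyed by code point.
# Only the letters needed for the sample below are included.
TRANSLIT_SUBSET = {
    0x0421: "S",
    0x0430: "a", 0x0431: "b", 0x0433: "g", 0x0435: "e",
    0x0437: "z", 0x0438: "i", 0x043A: "k", 0x043B: "l",
    0x043C: "m", 0x043D: "n", 0x043E: "o", 0x0440: "r",
    0x0441: "s", 0x0442: "t", 0x0443: "u", 0x0444: "f",
    0x0445: "x", 0x0446: "cz", 0x0448: "sh", 0x0449: "shh",
    0x044A: "``", 0x044C: "`", 0x044D: "e`", 0x044F: "ya",
    0x0451: "yo",
}


def translit_ascii(text, table=TRANSLIT_SUBSET):
    """Replace each non-ASCII character by its table entry, or '?' if none."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)  # ASCII passes through unchanged
        else:
            out.append(table.get(ord(ch), "?"))  # '?' mimics a missing rule
    return "".join(out)


sample = "Съешь ещё этих мягких французских булок"
print(translit_ascii(sample))
# prints: S``esh` eshhyo e`tix myagkix franczuzskix bulok
```

Passing an empty table, as in translit_ascii(sample, {}), turns every
Cyrillic letter into "?", which is the pre-patch symptom reported in
the bug.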