Message ID | 16e785f3-2e9f-ceb2-698f-dc33c91a5d5e@kobylkin.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 716 invoked by alias); 6 Aug 2018 19:00:58 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 686 invoked by uid 89); 6 Aug 2018 19:00:56 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.7 required=5.0 tests=AC_HTML_NONSENSE_TAGS, BAYES_50, BODY_8BITS, GARBLED_BODY, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.2 spammy=Agency, specifications, LETTER, MESSAGE X-HELO: mout.kundenserver.de From: Egor Kobylkin <egor@kobylkin.com> Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29 To: libc-alpha@sourceware.org, libc-locales@sourceware.org Cc: "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>, Carlos O'Donell <carlos@redhat.com>, Max Kutny <mkutny@gmail.com>, danilo@gnome.org References: <41532e13-a63d-5df1-ab37-05eb4d6c8d0a@kobylkin.com> <20180412224352.GB2911@altlinux.org> Openpgp: preference=signencrypt Message-ID: <16e785f3-2e9f-ceb2-698f-dc33c91a5d5e@kobylkin.com> Date: Mon, 6 Aug 2018 21:00:30 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <20180412224352.GB2911@altlinux.org> Content-Type: multipart/mixed; boundary="------------8BBA76D7B64389409B663312" |
Commit Message
Egor Kobylkin
Aug. 6, 2018, 7 p.m. UTC
Dear locale maintainers,

fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

add Cyrillic transliteration table translit_cyrillic file

https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]

to localedata/locales/ and include it in all your locales going forward.

Patch included inline below.

This is a re-submission for consideration for 2.29 at the request of
Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to make an explicit decision on whether
and how to include this patch.


Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic texts in most locales, including ru_RU [1] [8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

- It produces a string of question marks and spaces.

This is what it should produce and it does so after the patch is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.


Root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. Furthermore, it has to be referenced/included in the
active locale at compilation time to be used by iconv.


COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.

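
As an illustration of the conversion described above, here is a minimal Python sketch that applies a GOST 7.79-2000 (System B) style mapping to the test sentence. The dictionary is a hand-written assumption covering only the 33 Russian letters; the authoritative mapping is the attached translit_cyrillic file [7]. The names GOST_779_B and transliterate are hypothetical, and the sketch always renders "ц" as "cz", which matches the sample output above but is a simplification.

```python
# Illustrative sketch only: a hand-written GOST 7.79-2000 (System B) style
# mapping for the Russian alphabet. It is an assumption made for this
# example, not the content of the translit_cyrillic attachment.
GOST_779_B = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}


def transliterate(text: str) -> str:
    """Replace Russian Cyrillic letters with ASCII; pass other characters through."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in GOST_779_B:
            latin = GOST_779_B[lower]
            # An upper-case input letter gets a capitalised replacement ("Ж" -> "Zh").
            out.append(latin.capitalize() if ch != lower else latin)
        else:
            out.append(ch)
    return "".join(out)


print(transliterate("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

With this mapping the printed line matches the expected output quoted above (without the CYRILLIC prefix, which is part of the test input file).
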
While a UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration has only ASCII codes but still can be read by a native
speaker. Among other things it is useful for processing the Cyrillic
texts and filenames by programs or on systems that are not specifically
prepared to work with Cyrillic, don't have corresponding fonts installed
or can't handle UTF-8.

The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on GOST 7.79-2000 official source
(Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]). Technically an independent but identical source [3] was
used and prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the *include "translit_cyrillic";""* line to the LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/end stanza and generated a patch for them.

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
However, it would not be the standard Russian Cyrillic transliteration as
described above.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.

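
The patch-generation step described above (find locales with a translit_start/end stanza, add the include line, skip locales that already mention cyrillic) could be sketched roughly as follows. This is not the script actually used to produce the patch; the function name add_cyrillic_include and the constant INCLUDE_LINE are hypothetical, and the case-insensitive "cyrillic" check is only an approximation of the exclusion rule.

```python
# Hypothetical helper, written for illustration; it is not part of glibc
# and not the tool that generated the submitted patch.
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_source: str) -> str:
    """Return the locale source with the include placed right before translit_end.

    The text is returned unchanged when it already mentions "cyrillic"
    (which also covers an include that is already present) or when it has
    no translit_start ... translit_end stanza.
    """
    if "cyrillic" in locale_source.lower():
        return locale_source
    lines = locale_source.splitlines(keepends=True)
    in_stanza = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped == "translit_start":
            in_stanza = True
        elif stripped == "translit_end" and in_stanza:
            # Insert as the last entry of the stanza, as the diff hunks do.
            lines.insert(index, INCLUDE_LINE + "\n")
            return "".join(lines)
    return locale_source


sample = (
    'LC_CTYPE\ncopy "i18n"\n\ntranslit_start\n'
    'include "translit_combining";""\ntranslit_end\nEND LC_CTYPE\n'
)
print(add_cyrillic_include(sample))
```

Placing the new line directly before translit_end keeps any locale-specific rules earlier in the stanza ahead of the include.
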
Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618

Best regards,
Egor Kobylkin

---
2018-07-17 Egor Kobylkin <egor@kobylkin.com>

[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/nan_TW@latin: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/sd_IN@devanagari: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise

On 07/17/2018 03:50 PM, Egor Kobylkin wrote:
> On 17.07.2018 21:40, Carlos O'Donell wrote:
>> On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>
>> We are currently preparing for the 2.28 release and it may take
>> a while to review this change and the structure of the changes,
>> and the data itself.
>>
>> Is it OK if this material is reviewed for 2.29 inclusion (after
>> August 1st)?
>
> It's fine with me to postpone it for 2.29 inclusion (after August 1st).
> Should I send a reminder in August?

Yes please, ping the original patches again in August and we can review.

In the meantime others may feel free to review, but we won't consider them
for inclusion yet e.g. don't block the release.
Comments
Ping.

Absent feedback, I am wondering whether anything is missing in this patch
from the maintainers' standpoint. More than two months have passed since
the original submission.

If I can be of assistance, please do not hesitate to contact me,
Egor Kobylkin

On 06.08.2018 21:00, Egor Kobylkin wrote:
> Dear locale maintainers,
>
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>
> add Cyrillic transliteration table translit_cyrillic file
>
> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>
> to localedata/locales/ and include it in all your locales going forward.
>
> Patch included inline below.
>
> This is a re-submission for consideration for 2.29 at the request of
> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
>
> Their maintainers are requested to make an explicit decision on whether
> and how to include this patch.
>
>
>
> Current bug effect:
>
> The glibc wiki explicitly lists this use case as the test example
>
> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>
> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
>
> Currently it fails on Cyrillic texts in most locales, including ru_RU [1] [8] [9]:
>
> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
>
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>
> - It produces a string of question marks and spaces.
>
> This is what it should produce and it does so after the patch is applied:
>
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
>
>
> Root problem and the fix:
>
> The root problem is the missing transliteration table that I am
> supplying here. Furthermore, it has to be referenced/included in the
> active locale at compilation time to be used by iconv.
>
>
>
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) from a
> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>
> While a UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
> a transliteration has only ASCII codes but still can be read by a native
> speaker. Among other things it is useful for processing the Cyrillic
> texts and filenames by programs or on systems that are not specifically
> prepared to work with Cyrillic, don't have corresponding fonts installed
> or can't handle UTF-8.
>
> The transliteration table itself is attached as a file translit_cyrillic
> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
> (Federal Agency on Technical Regulating and Metrology of the Russian
> Federation [2]). Technically an independent but identical source [3] was
> used and prepared in a spreadsheet [6].
>
> The documentation suggests that a transliteration table is included by
> adding the *include "translit_cyrillic";""* line to the LC_CTYPE
> translit_start section
> http://man7.org/linux/man-pages/man5/locale.5.html [5]
> In practice, I searched for all locales that have a
> translit_start/end stanza and generated a patch for them.
>
> The Cyrillic transliteration of e.g. Russian text may have already
> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
> have their transliteration tables included inline.
> However, it would not be the standard Russian Cyrillic transliteration as
> described above.
> I am excluding these locales from this proposed patch. I have written
> directly to locale maintainer emails listed in the files. Volodymyr
> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
> exclusion.
>
> Links:
>
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?id=8590
> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>
> Best regards,
> Egor Kobylkin
>
> ---
> 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
>
> [BZ #2872]
> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
> table from Cyrillic to Latin.
> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
> section.
> * locales/aa_DJ: likewise > * locales/af_ZA: likewise > * locales/ak_GH: likewise > * locales/am_ET: likewise > * locales/ar_EG: likewise > * locales/be_BY: likewise > * locales/bem_ZM: likewise > * locales/ber_DZ: likewise > * locales/ber_MA: likewise > * locales/bg_BG: likewise > * locales/bi_VU: likewise > * locales/bn_BD: likewise > * locales/bo_CN: likewise > * locales/ca_ES: likewise > * locales/ce_RU: likewise > * locales/cs_CZ: likewise > * locales/cv_RU: likewise > * locales/cy_GB: likewise > * locales/da_DK: likewise > * locales/de_DE: likewise > * locales/dv_MV: likewise > * locales/dz_BT: likewise > * locales/el_GR: likewise > * locales/en_GB: likewise > * locales/en_NG: likewise > * locales/en_ZM: likewise > * locales/es_CU: likewise > * locales/es_ES: likewise > * locales/et_EE: likewise > * locales/fa_IR: likewise > * locales/ff_SN: likewise > * locales/fi_FI: likewise > * locales/fr_FR: likewise > * locales/ga_IE: likewise > * locales/gd_GB: likewise > * locales/gu_IN: likewise > * locales/gv_GB: likewise > * locales/he_IL: likewise > * locales/hi_IN: likewise > * locales/hif_FJ: likewise > * locales/hr_HR: likewise > * locales/ht_HT: likewise > * locales/hu_HU: likewise > * locales/hy_AM: likewise > * locales/id_ID: likewise > * locales/is_IS: likewise > * locales/it_IT: likewise > * locales/ja_JP: likewise > * locales/kk_KZ: likewise > * locales/km_KH: likewise > * locales/kn_IN: likewise > * locales/ko_KR: likewise > * locales/ks_IN: likewise > * locales/kw_GB: likewise > * locales/lb_LU: likewise > * locales/lg_UG: likewise > * locales/lij_IT: likewise > * locales/ln_CD: likewise > * locales/lo_LA: likewise > * locales/lt_LT: likewise > * locales/lv_LV: likewise > * locales/mg_MG: likewise > * locales/mhr_RU: likewise > * locales/mk_MK: likewise > * locales/ml_IN: likewise > * locales/ms_MY: likewise > * locales/mt_MT: likewise > * locales/nan_TW@latin: likewise > * locales/nb_NO: likewise > * locales/ne_NP: likewise > * locales/nhn_MX: 
likewise > * locales/niu_NU: likewise > * locales/niu_NZ: likewise > * locales/nl_NL: likewise > * locales/nr_ZA: likewise > * locales/oc_FR: likewise > * locales/om_KE: likewise > * locales/or_IN: likewise > * locales/os_RU: likewise > * locales/pa_IN: likewise > * locales/pa_PK: likewise > * locales/pl_PL: likewise > * locales/pt_PT: likewise > * locales/quz_PE: likewise > * locales/ro_RO: likewise > * locales/ru_RU: likewise > * locales/rw_RW: likewise > * locales/sa_IN: likewise > * locales/sd_IN: likewise > * locales/sd_IN@devanagari: likewise > * locales/sd_PK: likewise > * locales/se_NO: likewise > * locales/sgs_LT: likewise > * locales/si_LK: likewise > * locales/sk_SK: likewise > * locales/sl_SI: likewise > * locales/sm_WS: likewise > * locales/so_SO: likewise > * locales/sq_AL: likewise > * locales/ss_ZA: likewise > * locales/st_ZA: likewise > * locales/sv_SE: likewise > * locales/sw_KE: likewise > * locales/ta_IN: likewise > * locales/te_IN: likewise > * locales/th_TH: likewise > * locales/ti_ET: likewise > * locales/tn_ZA: likewise > * locales/to_TO: likewise > * locales/tpi_PG: likewise > * locales/tr_TR: likewise > * locales/ts_ZA: likewise > * locales/unm_US: likewise > * locales/ur_IN: likewise > * locales/ur_PK: likewise > * locales/ve_ZA: likewise > * locales/vi_VN: likewise > * locales/wa_BE: likewise > * locales/wo_SN: likewise > * locales/xh_ZA: likewise > * locales/yi_US: likewise > * locales/zh_CN: likewise > * locales/zu_ZA: likewise > > > diff -uNr a/localedata/locales/C b/localedata/locales/C > --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000 > @@ -2292,6 +2292,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ > --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000 > +++ b/localedata/locales/aa_DJ 2018-07-17 
17:55:47.000000000 +0000 > @@ -70,6 +70,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA > --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000 > +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000 > @@ -72,6 +72,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH > --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000 > +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000 > @@ -56,6 +56,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET > --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000 > +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000 > @@ -1396,6 +1396,7 @@ > <U137A> <U0060><U0039><U0030> > <U137B> <U0060><U0031><U0030><U0030> > <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> > +include "translit_cyrillic";"" > translit_end > % > END LC_CTYPE > diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG > --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000 > +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000 > @@ -44,6 +44,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY > --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000 > @@ -69,6 +69,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr 
a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM > --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000 > @@ -42,6 +42,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ > --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000 > @@ -166,6 +166,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA > --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000 > @@ -86,6 +86,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG > --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000 > @@ -49,6 +49,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU > --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000 > @@ -39,6 +39,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD > --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000 > @@ -63,6 +63,7 @@ > > translit_start > include "translit_combining";"" > 
+include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN > --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000 > @@ -43,6 +43,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES > --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000 > @@ -72,6 +72,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU > --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000 > @@ -39,6 +39,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ > --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000 > +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000 > @@ -2311,6 +2311,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU > --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000 > @@ -109,6 +109,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB > --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000 > @@ -69,6 
+69,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK > --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000 > @@ -167,6 +167,7 @@ > % LATIN SMALL LETTER O WITH STROKE -> "oe" > <U00F8> "<U006F><U0338>";"<U006F><U0065>" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE > --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000 > @@ -78,6 +78,7 @@ > % DOUBLE HIGH-REVERSED-9 QUOTATION MARK > <U201F> <U00AB>;<U0022> > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV > --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000 > @@ -52,6 +52,7 @@ > include "translit_combining";"" > > > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT > --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000 > @@ -60,6 +60,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR > --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000 > @@ -59,6 +59,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB > --- a/localedata/locales/en_GB 
2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000 > @@ -55,6 +55,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG > --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000 > +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000 > @@ -50,6 +50,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM > --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000 > +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000 > @@ -42,6 +42,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU > --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000 > +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000 > @@ -59,6 +59,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES > --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000 > +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000 > @@ -73,6 +73,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE > --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000 > +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000 > @@ -109,6 +109,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/fa_IR 
b/localedata/locales/fa_IR > --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000 > +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000 > @@ -79,6 +79,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN > --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000 > +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000 > @@ -42,6 +42,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI > --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000 > +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000 > @@ -137,6 +137,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR > --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000 > @@ -59,6 +59,7 @@ > % In France, accents are simply omitted if they cannot be represented. 
> include "translit_combining";"" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE > --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000 > @@ -54,6 +54,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB > --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000 > @@ -47,6 +47,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN > --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000 > @@ -62,6 +62,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB > --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000 > @@ -57,6 +57,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL > --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000 > @@ -59,6 +59,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN > --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/hi_IN 
2018-07-17 17:55:49.000000000 +0000 > @@ -61,6 +61,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ > --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000 > @@ -37,6 +37,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR > --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000 > @@ -153,6 +153,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT > --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000 > @@ -59,6 +59,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU > --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000 > +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000 > @@ -478,6 +478,7 @@ > <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>" > <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM > --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000 > @@ -77,6 +77,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr 
a/localedata/locales/id_ID b/localedata/locales/id_ID > --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000 > @@ -55,6 +55,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS > --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000 > @@ -2161,6 +2161,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT > --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000 > @@ -59,6 +59,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP > --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000 > @@ -1682,6 +1682,7 @@ > include "translit_combining";"" > include "translit_cjk_variants";"" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ > --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000 > @@ -158,6 +158,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH > --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000 > @@ -873,6 +873,7 @@ > > translit_start > include 
"translit_combining";"" > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN > --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000 > @@ -63,6 +63,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR > --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000 > @@ -6099,6 +6099,7 @@ > include "translit_combining";"" > include "translit_hangul";"" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN > --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000 > @@ -46,6 +46,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB > --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000 > @@ -58,6 +58,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU > --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000 > @@ -78,6 +78,7 @@ > % LATIN SMALL LETTER E WITH CIRCUMFLEX > <U00EA> "<U0065><U005E>" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG > --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000 > +++ 
b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000 > @@ -57,6 +57,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT > --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000 > +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000 > @@ -47,6 +47,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD > --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000 > @@ -39,6 +39,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA > --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000 > @@ -51,6 +51,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT > --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000 > @@ -77,6 +77,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV > --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000 > @@ -2122,6 +2122,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/mg_MG 
b/localedata/locales/mg_MG > --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000 > @@ -55,6 +55,7 @@ > % Accents are simply omitted if they cannot be represented. > include "translit_combining";"" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU > --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000 > @@ -59,6 +59,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK > --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000 > @@ -49,6 +49,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN > --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000 > @@ -60,6 +60,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > % > diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY > --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000 > @@ -45,6 +45,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT > --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000 > @@ -47,6 +47,7 @@ > > translit_start > include 
"translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/nan_TW@latin > b/localedata/locales/nan_TW@latin > --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000 > @@ -53,6 +53,7 @@ > % accents are simply omitted if they cannot be represented. > include "translit_combining";"" > > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO > --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000 > @@ -154,6 +154,7 @@ > % LATIN SMALL LETTER O WITH STROKE -> "oe" > <U00F8> "<U006F><U0338>";"<U006F><U0065>" > > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP > --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000 > @@ -43,6 +43,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX > --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000 > @@ -60,6 +60,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU > --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000 > @@ -60,6 +60,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ > 
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000 > @@ -60,6 +60,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL > --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000 > +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000 > @@ -57,6 +57,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA > --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000 > @@ -66,6 +66,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR > --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000 > @@ -62,6 +62,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE > --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000 > @@ -140,6 +140,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN > --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000 > @@ -62,6 +62,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END 
LC_CTYPE > > diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU > --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000 > @@ -70,6 +70,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > > END LC_CTYPE > diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN > --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000 > @@ -60,6 +60,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK > --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000 > @@ -58,6 +58,7 @@ > % Farsi yeh -> yeh > <U06CC> "<U064A>" > > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL > --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000 > @@ -142,6 +142,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT > --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000 > @@ -59,6 +59,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE > --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000 > @@ -57,6 +57,7 @@ > copy "i18n" > translit_start > include 
"translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO > --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000 > @@ -144,6 +144,7 @@ > <U0162> "<U021A>";"<U0054>" > <U0163> "<U021B>";"<U0074>" > > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU > --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000 > @@ -74,6 +74,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW > --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000 > @@ -45,6 +45,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN > --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000 > @@ -44,6 +44,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN > --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000 > @@ -46,6 +46,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sd_IN@devanagari > b/localedata/locales/sd_IN@devanagari > --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 > +0000 > +++ 
b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 > +0000 > @@ -44,6 +44,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK > --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000 > @@ -39,6 +39,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO > --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000 > @@ -205,6 +205,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT > --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000 > @@ -59,6 +59,7 @@ > copy "i18n" > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK > --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000 > @@ -45,6 +45,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK > --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000 > @@ -68,6 +68,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sl_SI 
b/localedata/locales/sl_SI > --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000 > +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000 > @@ -91,6 +91,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS > --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000 > @@ -37,6 +37,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO > --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000 > @@ -70,6 +70,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL > --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000 > @@ -45,6 +45,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA > --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000 > @@ -68,6 +68,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA > --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000 > @@ -64,6 +64,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > 
translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE > --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000 > @@ -139,6 +139,7 @@ > % LATIN SMALL LETTER O WITH STROKE -> "oe" > <U00F8> "<U006F><U0338>";"<U006F><U0065>" > > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE > --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000 > @@ -44,6 +44,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN > --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000 > @@ -63,6 +63,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN > --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000 > @@ -63,6 +63,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH > --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000 > @@ -58,6 +58,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET > --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000 > @@ 
-866,6 +866,7 @@ > <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> > > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > % > END LC_CTYPE > diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA > --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000 > @@ -69,6 +69,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO > --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000 > @@ -36,6 +36,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG > --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000 > +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000 > @@ -37,6 +37,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR > --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000 > +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000 > @@ -2430,6 +2430,7 @@ > > % TURKISH LIRA SIGN > <U20BA> "<U0054><U004C>" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/translit_cyrillic > b/localedata/locales/translit_cyrillic > --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 > +0000 > +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 > +0000 > @@ -0,0 +1,151 @@ > +escape_char / > +comment_char % > + > +% Transliterations that converts cyrillic letters to ascii symbols > inspired by GOST 7.79-2000 > +% 
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 > +% Generated from UnicodeData.txt with > +% https://sourceware.org/bugzilla/attachment.cgi?id=8590 > +% Up to three characters are required to do a reversible transliteration. > + > +LC_CTYPE > + > +translit_start > + > + > +% CYRILLIC CAPITAL LETTER IO > +<U0401> "<U0059><U004F>";<U0059> > +% CYRILLIC CAPITAL LETTER A > +<U0410> <U0041> > +% CYRILLIC CAPITAL LETTER BE > +<U0411> <U0042> > +% CYRILLIC CAPITAL LETTER VE > +<U0412> <U0056> > +% CYRILLIC CAPITAL LETTER GHE > +<U0413> <U0047> > +% CYRILLIC CAPITAL LETTER DE > +<U0414> <U0044> > +% CYRILLIC CAPITAL LETTER IE > +<U0415> <U0045> > +% CYRILLIC CAPITAL LETTER ZHE > +<U0416> "<U005A><U0048>";<U005A> > +% CYRILLIC CAPITAL LETTER ZE > +<U0417> <U005A> > +% CYRILLIC CAPITAL LETTER I > +<U0418> <U0049> > +% CYRILLIC CAPITAL LETTER SHORT I > +<U0419> <U004A> > +% CYRILLIC CAPITAL LETTER KA > +<U041A> <U004B> > +% CYRILLIC CAPITAL LETTER EL > +<U041B> <U004C> > +% CYRILLIC CAPITAL LETTER EM > +<U041C> <U004D> > +% CYRILLIC CAPITAL LETTER EN > +<U041D> <U004E> > +% CYRILLIC CAPITAL LETTER O > +<U041E> <U004F> > +% CYRILLIC CAPITAL LETTER PE > +<U041F> <U0050> > +% CYRILLIC CAPITAL LETTER ER > +<U0420> <U0052> > +% CYRILLIC CAPITAL LETTER ES > +<U0421> <U0053> > +% CYRILLIC CAPITAL LETTER TE > +<U0422> <U0054> > +% CYRILLIC CAPITAL LETTER U > +<U0423> <U0055> > +% CYRILLIC CAPITAL LETTER EF > +<U0424> <U0046> > +% CYRILLIC CAPITAL LETTER HA > +<U0425> <U0058> > +% CYRILLIC CAPITAL LETTER TSE > +<U0426> "<U0043><U005A>";<U0043> > +% CYRILLIC CAPITAL LETTER CHE > +<U0427> "<U0043><U0048>";<U0043> > +% CYRILLIC CAPITAL LETTER SHA > +<U0428> "<U0053><U0048>";<U0053> > +% CYRILLIC CAPITAL LETTER SHCHA > +<U0429> "<U0053><U0048><U0048>";<U0053> > +% CYRILLIC CAPITAL LETTER HARD SIGN > +<U042A> "<U0060><U0060>";<U0060> > +% CYRILLIC CAPITAL LETTER YERU > +<U042B> "<U0059><U0027>";<U0059> > +% CYRILLIC CAPITAL LETTER SOFT SIGN > +<U042C> <U0060> > +% CYRILLIC CAPITAL 
LETTER E > +<U042D> "<U0045><U0060>";<U0045> > +% CYRILLIC CAPITAL LETTER YU > +<U042E> "<U0059><U0055>";<U0059> > +% CYRILLIC CAPITAL LETTER YA > +<U042F> "<U0059><U0041>";<U0059> > +% CYRILLIC SMALL LETTER A > +<U0430> <U0061> > +% CYRILLIC SMALL LETTER BE > +<U0431> <U0062> > +% CYRILLIC SMALL LETTER VE > +<U0432> <U0076> > +% CYRILLIC SMALL LETTER GHE > +<U0433> <U0067> > +% CYRILLIC SMALL LETTER DE > +<U0434> <U0064> > +% CYRILLIC SMALL LETTER IE > +<U0435> <U0065> > +% CYRILLIC SMALL LETTER ZHE > +<U0436> "<U007A><U0068>";<U007A> > +% CYRILLIC SMALL LETTER ZE > +<U0437> <U007A> > +% CYRILLIC SMALL LETTER I > +<U0438> <U0069> > +% CYRILLIC SMALL LETTER SHORT I > +<U0439> <U006A> > +% CYRILLIC SMALL LETTER KA > +<U043A> <U006B> > +% CYRILLIC SMALL LETTER EL > +<U043B> <U006C> > +% CYRILLIC SMALL LETTER EM > +<U043C> <U006D> > +% CYRILLIC SMALL LETTER EN > +<U043D> <U006E> > +% CYRILLIC SMALL LETTER O > +<U043E> <U006F> > +% CYRILLIC SMALL LETTER PE > +<U043F> <U0070> > +% CYRILLIC SMALL LETTER ER > +<U0440> <U0072> > +% CYRILLIC SMALL LETTER ES > +<U0441> <U0073> > +% CYRILLIC SMALL LETTER TE > +<U0442> <U0074> > +% CYRILLIC SMALL LETTER U > +<U0443> <U0075> > +% CYRILLIC SMALL LETTER EF > +<U0444> <U0066> > +% CYRILLIC SMALL LETTER HA > +<U0445> <U0078> > +% CYRILLIC SMALL LETTER TSE > +<U0446> "<U0063><U007A>";<U0063> > +% CYRILLIC SMALL LETTER CHE > +<U0447> "<U0063><U0068>";<U0063> > +% CYRILLIC SMALL LETTER SHA > +<U0448> "<U0073><U0068>";<U0073> > +% CYRILLIC SMALL LETTER SHCHA > +<U0449> "<U0073><U0068><U0068>";<U0073> > +% CYRILLIC SMALL LETTER HARD SIGN > +<U044A> "<U0060><U0060>";<U0060> > +% CYRILLIC SMALL LETTER YERU > +<U044B> "<U0079><U0027>";<U0079> > +% CYRILLIC SMALL LETTER SOFT SIGN > +<U044C> <U0060> > +% CYRILLIC SMALL LETTER E > +<U044D> "<U0065><U0060>";<U0065> > +% CYRILLIC SMALL LETTER YU > +<U044E> "<U0079><U0075>";<U0079> > +% CYRILLIC SMALL LETTER YA > +<U044F> "<U0079><U0061>";<U0079> > +% CYRILLIC SMALL LETTER IO > +<U0451> 
"<U0079><U006F>";<U0079> > + > + > +translit_end > + > +END LC_CTYPE > diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA > --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000 > +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000 > @@ -64,6 +64,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US > --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000 > +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000 > @@ -48,6 +48,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN > --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000 > +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000 > @@ -46,6 +46,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK > --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000 > +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000 > @@ -58,6 +58,7 @@ > % Farsi yeh -> yeh > <U06CC> "<U064A>" > > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA > --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000 > +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000 > @@ -67,6 +67,7 @@ > > translit_start > include "translit_combining";"" > +include "translit_cyrillic";"" > translit_end > END LC_CTYPE > > diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN > --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000 > +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000 > @@ -58,6 +58,7 
@@
>  % dong sign -> d// -> dd
>  <U20AB> "<U0111>";"<U0064><U0064>"
>
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
>
> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
> @@ -69,6 +69,7 @@
>  <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>  <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
>
> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
> @@ -55,6 +55,7 @@
>  % Accents are simply omitted if they cannot be represented.
>  include "translit_combining";""
>
> +include "translit_cyrillic";""
>  translit_end
>
>  END LC_CTYPE
> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
> @@ -66,6 +66,7 @@
>
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
>
> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
> @@ -73,6 +73,7 @@
>  <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>  <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>  <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
> +include "translit_cyrillic";""
>  translit_end
>
>  END LC_CTYPE
> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
> @@ -58,6 +58,7 @@
>
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>
>  class "hanzi"; /
> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
> @@ -70,6 +70,7 @@
>
>  translit_start
>  include "translit_combining";""
> +include "translit_cyrillic";""
>  translit_end
>  END LC_CTYPE
>
>
>
Hi

Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for, for example, German, English and Danish,
and there is also an ISO standard for it.

But do go forward with fixing this bug.

Best regards
Keld

On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
> Ping.
>
> Absent of feedback I am wondering if anything could be missing in this
> patch from the maintainers standpoint. More than two months have passed
> since the original submission.
>
> If I can be of assistance, please do not hesitate to contact me,
> Egor Kobylkin
>
> On 06.08.2018 21:00, Egor Kobylkin wrote:
> > Dear locale maintainers,
> >
> > fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> >
> > add Cyrillic transliteration table translit_cyrillic file
> >
> > https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
> >
> > to localedata/locales/ and include it in all your locales going forward.
> >
> > Patch included inline below.
> >
> > This is a re-submission for the consideration for 2.29 on a request from
> > Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
> >
> > From this patch I have excluded locales that already mention cyrillic or
> > have a transliteration table for it:
> > az_AZ
> > iso14651_t1_common
> > ky_KG
> > mn_MN
> > sr_RS
> > tg_TJ
> > tk_TM
> > tt_RU
> > uk_UA
> > uz_UZ
> > uz_UZ@cyrillic
> >
> > Their maintainers are requested to make an explicit decision on how and
> > whether at all to include this patch.
> >
> >
> >
> > Current bug effect:
> >
> > The glibc wiki explicitly lists this use case as the test example
> >
> > https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
> >
> > LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
> > translit-test-input.txt
> >
> > currently it fails on Cyrillic texts in most locales including ru_RU [1]
> > [8] [9]:
> >
> > LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
> > translit-test-input.txt |grep CYRILLIC
> >
> > CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> >
> > - It produces a string of question marks and spaces.
> >
> > This is what it should produce and it does so after the patch applied:
> >
> > CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> > chayu.
> >
> >
> > Root problem and the fix:
> >
> > The root problem is the missing transliteration table that I am
> > supplying here. Furthermore it has to be referenced/included into the
> > active locale at the compilation time to be used by iconv.
> >
> >
> >
> > COMMIT MESSAGE:
> > This translit_cyrillic table enables conversion (e.g. with iconv) from a
> > UTF-8 encoded text based on Cyrillic alphabet to a ASCII//TRANSLIT text.
> >
> > While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
> > a transliteration has only ASCII codes but still can be read by a native
> > speaker. Among other things it is useful for processing the Cyrillic
> > texts and filenames by programs or on systems that are not specifically
> > prepared to work with Cyrillic, don't have corresponding fonts installed
> > or can't handle UTF-8.
> >
> > The transliteration table itself is attached as a file translit_cyrillic
> > [7]. Its content (mapping) is based on GOST 7.79-2000 official source
> > (Federal Agency on Technical Regulating and Metrology Of Russian
> > Federation [2]). Technically an independent but identical source [3] was
> > used and prepared in a spreadsheet [6].
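
To make the expected output quoted above concrete, here is a minimal Python sketch of the kind of per-letter mapping the table describes. It is not the translit_cyrillic file itself (that is a glibc locale source file and is not reproduced in this message). The dictionary covers only the letters needed for one sentence, the Cyrillic input sentence is an assumption reconstructed from the expected ASCII output (translit-test-input.txt [9] is not shown here), and the name `transliterate` is hypothetical. Mapping "ц" to "cz" unconditionally is a simplification that matches the quoted output.

```python
import unicodedata

# Illustrative subset only: letters needed for the sample sentence.
# Assumption: values follow the ASCII forms visible in the expected output
# quoted in the email; this is not the full translit_cyrillic table.
GOST_SUBSET = {
    "С": "S",
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "\u0451": "yo",   # ё
    "ж": "zh", "з": "z", "и": "i",
    "\u0439": "j",    # й
    "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x",
    "ц": "cz",        # simplification: always "cz"
    "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}


def transliterate(text: str) -> str:
    """Map each character through the subset table.

    ASCII passes through unchanged; any other unmapped character becomes
    "?", mimicking the question marks shown in the failing iconv output.
    """
    out = []
    # Normalize so that precomposed and decomposed forms of ё/й both match.
    for ch in unicodedata.normalize("NFC", text):
        if ch in GOST_SUBSET:
            out.append(GOST_SUBSET[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            out.append("?")
    return "".join(out)


if __name__ == "__main__":
    # Assumed input sentence (a common Russian pangram).
    print(transliterate("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

Because several Cyrillic letters expand to two or three ASCII characters (for example "щ" to "shh"), the mapping has to be string-valued rather than a one-to-one character table, which is also why the locale sources express it as sequences of `<Uxxxx>` code points.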
> >
> > The documentation suggests that the transliteration tables inclusion is
> > done by adding *include "translit_cyrillic";""* string into LC_CTYPE
> > translit_start section
> > http://man7.org/linux/man-pages/man5/locale.5.html [5]
> > Practically I have searched for all locales that have a
> > translit_start/end stance and generated a patch for them.
> >
> > The Cyrillic transliteration of e.g. Russian text may have already
> > worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
> > have their transliteration tables included inline.
> > However it would not be the standard Russian Cyrillic transliteration as
> > described above.
> > I am excluding these locales from this proposed patch. I have written
> > directly to locale maintainer emails listed in the files. Volodymyr
> > Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> > ???????????? ?????????? <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
> > exclusion.
> >
> > Links:
> >
> > [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> > [2] GOST 7.79-2000 official source
> > http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> > available in low quality gif format)
> > [3] http://transliteration.ru/gost-7-79-2000/ and
> > http://www.yfermer.ru/specifications/285821.html
> > [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> > https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> > [5] http://man7.org/linux/man-pages/man5/locale.5.html
> > [6] Spreadsheet for generating translit_cyrillic
> > https://sourceware.org/bugzilla/attachment.cgi?id=8590
> > [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
> > [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt > > https://sourceware.org/bugzilla/attachment.cgi?id=8618 > > > > Best regards, > > Egor Kobylkin > > > > --- > > 2018-07-17 Egor Kobylkin <egor@kobylkin.com> > > > > [BZ #2872] > > * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration > > table from Cyrillic to Latin. > > * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit > > section. > > * locales/aa_DJ: likewise > > * locales/af_ZA: likewise > > * locales/ak_GH: likewise > > * locales/am_ET: likewise > > * locales/ar_EG: likewise > > * locales/be_BY: likewise > > * locales/bem_ZM: likewise > > * locales/ber_DZ: likewise > > * locales/ber_MA: likewise > > * locales/bg_BG: likewise > > * locales/bi_VU: likewise > > * locales/bn_BD: likewise > > * locales/bo_CN: likewise > > * locales/ca_ES: likewise > > * locales/ce_RU: likewise > > * locales/cs_CZ: likewise > > * locales/cv_RU: likewise > > * locales/cy_GB: likewise > > * locales/da_DK: likewise > > * locales/de_DE: likewise > > * locales/dv_MV: likewise > > * locales/dz_BT: likewise > > * locales/el_GR: likewise > > * locales/en_GB: likewise > > * locales/en_NG: likewise > > * locales/en_ZM: likewise > > * locales/es_CU: likewise > > * locales/es_ES: likewise > > * locales/et_EE: likewise > > * locales/fa_IR: likewise > > * locales/ff_SN: likewise > > * locales/fi_FI: likewise > > * locales/fr_FR: likewise > > * locales/ga_IE: likewise > > * locales/gd_GB: likewise > > * locales/gu_IN: likewise > > * locales/gv_GB: likewise > > * locales/he_IL: likewise > > * locales/hi_IN: likewise > > * locales/hif_FJ: likewise > > * locales/hr_HR: likewise > > * locales/ht_HT: likewise > > * locales/hu_HU: likewise > > * locales/hy_AM: likewise > > * locales/id_ID: likewise > > * locales/is_IS: likewise > > * locales/it_IT: likewise > > * locales/ja_JP: likewise > > * locales/kk_KZ: likewise > > * locales/km_KH: likewise > > * locales/kn_IN: likewise > > * locales/ko_KR: likewise > > * locales/ks_IN: likewise > 
> * locales/kw_GB: likewise > > * locales/lb_LU: likewise > > * locales/lg_UG: likewise > > * locales/lij_IT: likewise > > * locales/ln_CD: likewise > > * locales/lo_LA: likewise > > * locales/lt_LT: likewise > > * locales/lv_LV: likewise > > * locales/mg_MG: likewise > > * locales/mhr_RU: likewise > > * locales/mk_MK: likewise > > * locales/ml_IN: likewise > > * locales/ms_MY: likewise > > * locales/mt_MT: likewise > > * locales/nan_TW@latin: likewise > > * locales/nb_NO: likewise > > * locales/ne_NP: likewise > > * locales/nhn_MX: likewise > > * locales/niu_NU: likewise > > * locales/niu_NZ: likewise > > * locales/nl_NL: likewise > > * locales/nr_ZA: likewise > > * locales/oc_FR: likewise > > * locales/om_KE: likewise > > * locales/or_IN: likewise > > * locales/os_RU: likewise > > * locales/pa_IN: likewise > > * locales/pa_PK: likewise > > * locales/pl_PL: likewise > > * locales/pt_PT: likewise > > * locales/quz_PE: likewise > > * locales/ro_RO: likewise > > * locales/ru_RU: likewise > > * locales/rw_RW: likewise > > * locales/sa_IN: likewise > > * locales/sd_IN: likewise > > * locales/sd_IN@devanagari: likewise > > * locales/sd_PK: likewise > > * locales/se_NO: likewise > > * locales/sgs_LT: likewise > > * locales/si_LK: likewise > > * locales/sk_SK: likewise > > * locales/sl_SI: likewise > > * locales/sm_WS: likewise > > * locales/so_SO: likewise > > * locales/sq_AL: likewise > > * locales/ss_ZA: likewise > > * locales/st_ZA: likewise > > * locales/sv_SE: likewise > > * locales/sw_KE: likewise > > * locales/ta_IN: likewise > > * locales/te_IN: likewise > > * locales/th_TH: likewise > > * locales/ti_ET: likewise > > * locales/tn_ZA: likewise > > * locales/to_TO: likewise > > * locales/tpi_PG: likewise > > * locales/tr_TR: likewise > > * locales/ts_ZA: likewise > > * locales/unm_US: likewise > > * locales/ur_IN: likewise > > * locales/ur_PK: likewise > > * locales/ve_ZA: likewise > > * locales/vi_VN: likewise > > * locales/wa_BE: likewise > > * locales/wo_SN: 
likewise > > * locales/xh_ZA: likewise > > * locales/yi_US: likewise > > * locales/zh_CN: likewise > > * locales/zu_ZA: likewise > > > > > > diff -uNr a/localedata/locales/C b/localedata/locales/C > > --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000 > > @@ -2292,6 +2292,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ > > --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000 > > +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000 > > @@ -70,6 +70,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA > > --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000 > > +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000 > > @@ -72,6 +72,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH > > --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000 > > +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000 > > @@ -56,6 +56,7 @@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET > > --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000 > > +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000 > > @@ -1396,6 +1396,7 @@ > > <U137A> <U0060><U0039><U0030> > > <U137B> <U0060><U0031><U0030><U0030> > > <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> > > +include "translit_cyrillic";"" 
> > translit_end > > % > > END LC_CTYPE > > diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG > > --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000 > > +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000 > > @@ -44,6 +44,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY > > --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000 > > @@ -69,6 +69,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM > > --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000 > > @@ -42,6 +42,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ > > --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000 > > @@ -166,6 +166,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA > > --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000 > > @@ -86,6 +86,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG > > --- a/localedata/locales/bg_BG 2018-07-17 
17:49:13.000000000 +0000 > > +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000 > > @@ -49,6 +49,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU > > --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000 > > @@ -39,6 +39,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD > > --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000 > > @@ -63,6 +63,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN > > --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000 > > @@ -43,6 +43,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES > > --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000 > > @@ -72,6 +72,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU > > --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000 > > @@ -39,6 +39,7 @@ > > > > translit_start > > include 
"translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ > > --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000 > > +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000 > > @@ -2311,6 +2311,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU > > --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000 > > @@ -109,6 +109,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB > > --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000 > > @@ -69,6 +69,7 @@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK > > --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000 > > @@ -167,6 +167,7 @@ > > % LATIN SMALL LETTER O WITH STROKE -> "oe" > > <U00F8> "<U006F><U0338>";"<U006F><U0065>" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE > > --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000 > > @@ -78,6 +78,7 @@ > > % DOUBLE HIGH-REVERSED-9 QUOTATION MARK > > <U201F> <U00AB>;<U0022> > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > 
> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV > > --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000 > > @@ -52,6 +52,7 @@ > > include "translit_combining";"" > > > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT > > --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000 > > @@ -60,6 +60,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR > > --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000 > > @@ -59,6 +59,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB > > --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000 > > @@ -55,6 +55,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG > > --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000 > > +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000 > > @@ -50,6 +50,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM > > --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000 > > +++ b/localedata/locales/en_ZM 2018-07-17 
17:55:48.000000000 +0000 > > @@ -42,6 +42,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU > > --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000 > > +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000 > > @@ -59,6 +59,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES > > --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000 > > +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000 > > @@ -73,6 +73,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE > > --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000 > > +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000 > > @@ -109,6 +109,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR > > --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000 > > +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000 > > @@ -79,6 +79,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN > > --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000 > > +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000 > > @@ -42,6 +42,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END 
LC_CTYPE > > > > diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI > > --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000 > > +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000 > > @@ -137,6 +137,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR > > --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000 > > @@ -59,6 +59,7 @@ > > % In France, accents are simply omitted if they cannot be represented. > > include "translit_combining";"" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE > > --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000 > > @@ -54,6 +54,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB > > --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000 > > @@ -47,6 +47,7 @@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN > > --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000 > > @@ -62,6 +62,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB > > --- 
a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000 > > @@ -57,6 +57,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL > > --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000 > > @@ -59,6 +59,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN > > --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000 > > @@ -61,6 +61,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ > > --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000 > > @@ -37,6 +37,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR > > --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000 > > @@ -153,6 +153,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT > > --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000 > > @@ -59,6 +59,7 @@ > > copy 
"i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU > > --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000 > > +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000 > > @@ -478,6 +478,7 @@ > > <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>" > > <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM > > --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000 > > @@ -77,6 +77,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID > > --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000 > > @@ -55,6 +55,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS > > --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000 > > @@ -2161,6 +2161,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT > > --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000 > > @@ -59,6 +59,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" 
> > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP > > --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000 > > @@ -1682,6 +1682,7 @@ > > include "translit_combining";"" > > include "translit_cjk_variants";"" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ > > --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000 > > @@ -158,6 +158,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH > > --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000 > > @@ -873,6 +873,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN > > --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000 > > @@ -63,6 +63,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR > > --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000 > > @@ -6099,6 +6099,7 @@ > > include "translit_combining";"" > > include "translit_hangul";"" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN > > --- 
a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000 > > @@ -46,6 +46,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB > > --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000 > > @@ -58,6 +58,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU > > --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000 > > @@ -78,6 +78,7 @@ > > % LATIN SMALL LETTER E WITH CIRCUMFLEX > > <U00EA> "<U0065><U005E>" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG > > --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000 > > @@ -57,6 +57,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT > > --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000 > > +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000 > > @@ -47,6 +47,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD > > --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000 > > @@ -39,6 +39,7 
@@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA > > --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000 > > @@ -51,6 +51,7 @@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT > > --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000 > > @@ -77,6 +77,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV > > --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000 > > @@ -2122,6 +2122,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG > > --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000 > > @@ -55,6 +55,7 @@ > > % Accents are simply omitted if they cannot be represented. 
> > include "translit_combining";"" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU > > --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000 > > @@ -59,6 +59,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK > > --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000 > > @@ -49,6 +49,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN > > --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000 > > @@ -60,6 +60,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > % > > diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY > > --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000 > > @@ -45,6 +45,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT > > --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000 > > @@ -47,6 +47,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/nan_TW@latin > > 
b/localedata/locales/nan_TW@latin > > --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000 > > @@ -53,6 +53,7 @@ > > % accents are simply omitted if they cannot be represented. > > include "translit_combining";"" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO > > --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000 > > @@ -154,6 +154,7 @@ > > % LATIN SMALL LETTER O WITH STROKE -> "oe" > > <U00F8> "<U006F><U0338>";"<U006F><U0065>" > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP > > --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000 > > @@ -43,6 +43,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX > > --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000 > > @@ -60,6 +60,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU > > --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000 > > @@ -60,6 +60,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ > > --- 
a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000 > > @@ -60,6 +60,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL > > --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000 > > +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000 > > @@ -57,6 +57,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA > > --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000 > > @@ -66,6 +66,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR > > --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000 > > @@ -62,6 +62,7 @@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE > > --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000 > > @@ -140,6 +140,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN > > --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000 > > @@ -62,6 +62,7 @@ > 
> > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU > > --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000 > > @@ -70,6 +70,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN > > --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000 > > @@ -60,6 +60,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK > > --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000 > > @@ -58,6 +58,7 @@ > > % Farsi yeh -> yeh > > <U06CC> "<U064A>" > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL > > --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000 > > @@ -142,6 +142,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT > > --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000 > > @@ -59,6 +59,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/quz_PE 
b/localedata/locales/quz_PE > > --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000 > > @@ -57,6 +57,7 @@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO > > --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000 > > @@ -144,6 +144,7 @@ > > <U0162> "<U021A>";"<U0054>" > > <U0163> "<U021B>";"<U0074>" > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU > > --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000 > > @@ -74,6 +74,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW > > --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000 > > @@ -45,6 +45,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN > > --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000 > > @@ -44,6 +44,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN > > --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/sd_IN 2018-07-17 
17:55:51.000000000 +0000 > > @@ -46,6 +46,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sd_IN@devanagari > > b/localedata/locales/sd_IN@devanagari > > --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 > > +0000 > > +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 > > +0000 > > @@ -44,6 +44,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK > > --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000 > > @@ -39,6 +39,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO > > --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000 > > @@ -205,6 +205,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT > > --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000 > > @@ -59,6 +59,7 @@ > > copy "i18n" > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK > > --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000 > > @@ -45,6 +45,7 @@ > > > > translit_start > > include 
"translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK > > --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000 > > @@ -68,6 +68,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI > > --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000 > > +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000 > > @@ -91,6 +91,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS > > --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000 > > @@ -37,6 +37,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO > > --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000 > > @@ -70,6 +70,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL > > --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000 > > @@ -45,6 +45,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA > > --- 
a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000 > > @@ -68,6 +68,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA > > --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000 > > @@ -64,6 +64,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE > > --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000 > > @@ -139,6 +139,7 @@ > > % LATIN SMALL LETTER O WITH STROKE -> "oe" > > <U00F8> "<U006F><U0338>";"<U006F><U0065>" > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE > > --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000 > > @@ -44,6 +44,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN > > --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000 > > @@ -63,6 +63,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN > > --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000 
> > @@ -63,6 +63,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH > > --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000 > > @@ -58,6 +58,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET > > --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000 > > @@ -866,6 +866,7 @@ > > <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> > > > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > % > > END LC_CTYPE > > diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA > > --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000 > > @@ -69,6 +69,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO > > --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000 > > @@ -36,6 +36,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG > > --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000 > > +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000 > > @@ -37,6 +37,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > 
translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR > > --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000 > > @@ -2430,6 +2430,7 @@ > > > > % TURKISH LIRA SIGN > > <U20BA> "<U0054><U004C>" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/translit_cyrillic > > b/localedata/locales/translit_cyrillic > > --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 > > +0000 > > +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 > > +0000 > > @@ -0,0 +1,151 @@ > > +escape_char / > > +comment_char % > > + > > +% Transliterations that converts cyrillic letters to ascii symbols > > inspired by GOST 7.79-2000 > > +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872 > > +% Generated from UnicodeData.txt with > > +% https://sourceware.org/bugzilla/attachment.cgi?id=8590 > > +% Up to three characters are required to do a reversible transliteration. 
> > + > > +LC_CTYPE > > + > > +translit_start > > + > > + > > +% CYRILLIC CAPITAL LETTER IO > > +<U0401> "<U0059><U004F>";<U0059> > > +% CYRILLIC CAPITAL LETTER A > > +<U0410> <U0041> > > +% CYRILLIC CAPITAL LETTER BE > > +<U0411> <U0042> > > +% CYRILLIC CAPITAL LETTER VE > > +<U0412> <U0056> > > +% CYRILLIC CAPITAL LETTER GHE > > +<U0413> <U0047> > > +% CYRILLIC CAPITAL LETTER DE > > +<U0414> <U0044> > > +% CYRILLIC CAPITAL LETTER IE > > +<U0415> <U0045> > > +% CYRILLIC CAPITAL LETTER ZHE > > +<U0416> "<U005A><U0048>";<U005A> > > +% CYRILLIC CAPITAL LETTER ZE > > +<U0417> <U005A> > > +% CYRILLIC CAPITAL LETTER I > > +<U0418> <U0049> > > +% CYRILLIC CAPITAL LETTER SHORT I > > +<U0419> <U004A> > > +% CYRILLIC CAPITAL LETTER KA > > +<U041A> <U004B> > > +% CYRILLIC CAPITAL LETTER EL > > +<U041B> <U004C> > > +% CYRILLIC CAPITAL LETTER EM > > +<U041C> <U004D> > > +% CYRILLIC CAPITAL LETTER EN > > +<U041D> <U004E> > > +% CYRILLIC CAPITAL LETTER O > > +<U041E> <U004F> > > +% CYRILLIC CAPITAL LETTER PE > > +<U041F> <U0050> > > +% CYRILLIC CAPITAL LETTER ER > > +<U0420> <U0052> > > +% CYRILLIC CAPITAL LETTER ES > > +<U0421> <U0053> > > +% CYRILLIC CAPITAL LETTER TE > > +<U0422> <U0054> > > +% CYRILLIC CAPITAL LETTER U > > +<U0423> <U0055> > > +% CYRILLIC CAPITAL LETTER EF > > +<U0424> <U0046> > > +% CYRILLIC CAPITAL LETTER HA > > +<U0425> <U0058> > > +% CYRILLIC CAPITAL LETTER TSE > > +<U0426> "<U0043><U005A>";<U0043> > > +% CYRILLIC CAPITAL LETTER CHE > > +<U0427> "<U0043><U0048>";<U0043> > > +% CYRILLIC CAPITAL LETTER SHA > > +<U0428> "<U0053><U0048>";<U0053> > > +% CYRILLIC CAPITAL LETTER SHCHA > > +<U0429> "<U0053><U0048><U0048>";<U0053> > > +% CYRILLIC CAPITAL LETTER HARD SIGN > > +<U042A> "<U0060><U0060>";<U0060> > > +% CYRILLIC CAPITAL LETTER YERU > > +<U042B> "<U0059><U0027>";<U0059> > > +% CYRILLIC CAPITAL LETTER SOFT SIGN > > +<U042C> <U0060> > > +% CYRILLIC CAPITAL LETTER E > > +<U042D> "<U0045><U0060>";<U0045> > > +% CYRILLIC CAPITAL LETTER YU > > +<U042E> 
"<U0059><U0055>";<U0059> > > +% CYRILLIC CAPITAL LETTER YA > > +<U042F> "<U0059><U0041>";<U0059> > > +% CYRILLIC SMALL LETTER A > > +<U0430> <U0061> > > +% CYRILLIC SMALL LETTER BE > > +<U0431> <U0062> > > +% CYRILLIC SMALL LETTER VE > > +<U0432> <U0076> > > +% CYRILLIC SMALL LETTER GHE > > +<U0433> <U0067> > > +% CYRILLIC SMALL LETTER DE > > +<U0434> <U0064> > > +% CYRILLIC SMALL LETTER IE > > +<U0435> <U0065> > > +% CYRILLIC SMALL LETTER ZHE > > +<U0436> "<U007A><U0068>";<U007A> > > +% CYRILLIC SMALL LETTER ZE > > +<U0437> <U007A> > > +% CYRILLIC SMALL LETTER I > > +<U0438> <U0069> > > +% CYRILLIC SMALL LETTER SHORT I > > +<U0439> <U006A> > > +% CYRILLIC SMALL LETTER KA > > +<U043A> <U006B> > > +% CYRILLIC SMALL LETTER EL > > +<U043B> <U006C> > > +% CYRILLIC SMALL LETTER EM > > +<U043C> <U006D> > > +% CYRILLIC SMALL LETTER EN > > +<U043D> <U006E> > > +% CYRILLIC SMALL LETTER O > > +<U043E> <U006F> > > +% CYRILLIC SMALL LETTER PE > > +<U043F> <U0070> > > +% CYRILLIC SMALL LETTER ER > > +<U0440> <U0072> > > +% CYRILLIC SMALL LETTER ES > > +<U0441> <U0073> > > +% CYRILLIC SMALL LETTER TE > > +<U0442> <U0074> > > +% CYRILLIC SMALL LETTER U > > +<U0443> <U0075> > > +% CYRILLIC SMALL LETTER EF > > +<U0444> <U0066> > > +% CYRILLIC SMALL LETTER HA > > +<U0445> <U0078> > > +% CYRILLIC SMALL LETTER TSE > > +<U0446> "<U0063><U007A>";<U0063> > > +% CYRILLIC SMALL LETTER CHE > > +<U0447> "<U0063><U0068>";<U0063> > > +% CYRILLIC SMALL LETTER SHA > > +<U0448> "<U0073><U0068>";<U0073> > > +% CYRILLIC SMALL LETTER SHCHA > > +<U0449> "<U0073><U0068><U0068>";<U0073> > > +% CYRILLIC SMALL LETTER HARD SIGN > > +<U044A> "<U0060><U0060>";<U0060> > > +% CYRILLIC SMALL LETTER YERU > > +<U044B> "<U0079><U0027>";<U0079> > > +% CYRILLIC SMALL LETTER SOFT SIGN > > +<U044C> <U0060> > > +% CYRILLIC SMALL LETTER E > > +<U044D> "<U0065><U0060>";<U0065> > > +% CYRILLIC SMALL LETTER YU > > +<U044E> "<U0079><U0075>";<U0079> > > +% CYRILLIC SMALL LETTER YA > > +<U044F> "<U0079><U0061>";<U0079> > > 
+% CYRILLIC SMALL LETTER IO > > +<U0451> "<U0079><U006F>";<U0079> > > + > > + > > +translit_end > > + > > +END LC_CTYPE > > diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA > > --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000 > > @@ -64,6 +64,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US > > --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000 > > @@ -48,6 +48,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN > > --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000 > > @@ -46,6 +46,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK > > --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000 > > @@ -58,6 +58,7 @@ > > % Farsi yeh -> yeh > > <U06CC> "<U064A>" > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA > > --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000 > > @@ -67,6 +67,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/vi_VN 
b/localedata/locales/vi_VN > > --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000 > > @@ -58,6 +58,7 @@ > > % dong sign -> d// -> dd > > <U20AB> "<U0111>";"<U0064><U0064>" > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE > > --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000 > > @@ -69,6 +69,7 @@ > > <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>" > > <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>" > > > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN > > --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000 > > @@ -55,6 +55,7 @@ > > % Accents are simply omitted if they cannot be represented. 
> > include "translit_combining";"" > > > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA > > --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000 > > @@ -66,6 +66,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US > > --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000 > > @@ -73,6 +73,7 @@ > > <U05F0> "<U05D5><U05D5>";"<U0077><U0077>" > > <U05F1> "<U05D5><U05D9>";"<U0077><U006A>" > > <U05F2> "<U05D9><U05D9>";"<U006A><U006A>" > > +include "translit_cyrillic";"" > > translit_end > > > > END LC_CTYPE > > diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN > > --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000 > > +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000 > > @@ -58,6 +58,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > > > class "hanzi"; / > > diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA > > --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000 > > +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000 > > @@ -70,6 +70,7 @@ > > > > translit_start > > include "translit_combining";"" > > +include "translit_cyrillic";"" > > translit_end > > END LC_CTYPE > > > > > >
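Every per-locale hunk quoted above makes the same one-line change: it adds `include "translit_cyrillic";""` directly before the `translit_end` line of the locale's LC_CTYPE section. As a rough illustration only (this is not the script used to generate the patch; the function name and the idempotence check are assumptions made for this sketch), the edit could be expressed as:

```python
# Hypothetical helper: mirrors the one-line edit seen in each quoted hunk.
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text: str) -> str:
    """Insert the include line before every 'translit_end' line.

    If the include is already present, the text is returned unchanged,
    so applying the function twice is harmless.
    """
    lines = locale_text.splitlines()
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return locale_text

    out = []
    for line in lines:
        if line.strip() == "translit_end":
            out.append(INCLUDE_LINE)
        out.append(line)

    # Preserve a trailing newline if the input had one.
    trailing = "\n" if locale_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Locales without a `translit_start`/`translit_end` block are left untouched by this sketch, which matches the thread's statement that only locales already having such a block were patched.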
On 03.10.2018 11:19, Keld Simonsen wrote:
> Hi
>
> Please note that transliteration of Cyrillic to Latin is not universal.
> There are different schemes for, e.g., German, English and Danish, and
> there is also an ISO standard for it.

Thanks for your feedback, Keld!

Could the locale maintainers who would rather not include this patch say so
explicitly here? That is:

- If there is a different preferred Cyrillic transliteration table for a
  specific locale, its maintainers may want to point me to it so I can
  supply a separate table/patch.
- Or they could state explicitly that, for some reason, they would like to
  exclude their locale from the patch for a default Cyrillic
  transliteration altogether.

--Egor

>
> But do go forward with fixing this bug.
>
> Best regards
> Keld
>
> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>> Ping.
>>
>> Absent feedback, I am wondering if anything could be missing in this
>> patch from the maintainers' standpoint. More than two months have passed
>> since the original submission.
>>
>> If I can be of assistance, please do not hesitate to contact me,
>> Egor Kobylkin
>>
>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> to fix glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>
>>> please add the Cyrillic transliteration table file translit_cyrillic
>>>
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>
>>> to localedata/locales/ and include it in all your locales going forward.
>>>
>>> Patch included inline below.
>>>
>>> This is a re-submission for consideration for 2.29 at the request of
>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>
>>> From this patch I have excluded locales that already mention Cyrillic or
>>> have a transliteration table for it:
>>> az_AZ
>>> iso14651_t1_common
>>> ky_KG
>>> mn_MN
>>> sr_RS
>>> tg_TJ
>>> tk_TM
>>> tt_RU
>>> uk_UA
>>> uz_UZ
>>> uz_UZ@cyrillic
>>>
>>> Their maintainers are requested to make an explicit decision on whether
>>> and how to include this patch.
>>>
>>>
>>>
>>> Current bug effect:
>>>
>>> The glibc wiki explicitly lists this use case as the test example
>>>
>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>
>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
>>>
>>> Currently it fails on Cyrillic texts in most locales, including ru_RU [1]
>>> [8] [9]:
>>>
>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
>>>
>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>
>>> - It produces a string of question marks and spaces.
>>>
>>> This is what it should produce, and does produce once the patch is applied:
>>>
>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
>>>
>>>
>>> Root problem and the fix:
>>>
>>> The root problem is the missing transliteration table that I am
>>> supplying here. Furthermore, it has to be referenced/included in the
>>> active locale at compilation time to be used by iconv.
>>>
>>>
>>>
>>> COMMIT MESSAGE:
>>> This translit_cyrillic table enables conversion (e.g. with iconv) from a
>>> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>>>
>>> While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
>>> a transliteration has only ASCII codes but can still be read by a native
>>> speaker. Among other things, it is useful for processing Cyrillic
>>> texts and filenames by programs or on systems that are not specifically
>>> prepared to work with Cyrillic, don't have corresponding fonts installed
>>> or can't handle UTF-8.
>>>
>>> The transliteration table itself is attached as a file translit_cyrillic
>>> [7]. Its content (mapping) is based on the GOST 7.79-2000 official source
>>> (Federal Agency on Technical Regulating and Metrology Of Russian
>>> Federation [2]). Technically, an independent but identical source [3] was
>>> used and prepared in a spreadsheet [6].
>>>
>>> The documentation suggests that a transliteration table is included
>>> by adding the *include "translit_cyrillic";""* string to the LC_CTYPE
>>> translit_start section
>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>> In practice, I have searched for all locales that have a
>>> translit_start/end stanza and generated a patch for them.
>>>
>>> The Cyrillic transliteration of e.g. Russian text may have already
>>> worked to some extent for the mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales,
>>> which have their transliteration tables included inline.
>>> However, it would not be the standard Russian Cyrillic transliteration as
>>> described above.
>>> I am excluding these locales from this proposed patch. I have written
>>> directly to the locale maintainer emails listed in the files. Volodymyr
>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>> ???????????? ?????????? <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>>> exclusion.
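To make the mapping concrete, here is a minimal Python sketch of what the table does for plain ASCII output. It is not how glibc or iconv implement transliteration; it only looks up, per letter, the first alternative listed in the translit_cyrillic table quoted in this thread. The input sentence is an assumption: it is the Russian pangram that the expected output in the message appears to be derived from, since the thread only shows the question-mark output and the transliterated result.

```python
# First (preferred) alternatives from the translit_cyrillic table quoted in
# this thread, lowercase letters only; uppercase entries are derived below.
_LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "cz", "ч": "ch",
    "ш": "sh", "щ": "shh", "ъ": "``", "ы": "y'", "ь": "`",
    "э": "e`", "ю": "yu", "я": "ya", "ё": "yo",
}

# In the quoted table the capital letters map to the upper-cased form of the
# same strings (e.g. U+0416 -> "ZH"), so they can be generated mechanically.
TABLE = dict(_LOWER)
TABLE.update({k.upper(): v.upper() for k, v in _LOWER.items()})


def transliterate(text: str) -> str:
    """Replace each mapped Cyrillic letter; pass everything else through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # Assumed test sentence (not shown in readable form in the thread).
    print(transliterate("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

Run as a script, this prints the same string the message gives as the expected post-patch iconv result for the CYRILLIC test line.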
>>>
>>> Links:
>>>
>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>> [2] GOST 7.79-2000 official source
>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>> available in low quality gif format)
>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>> http://www.yfermer.ru/specifications/285821.html
>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>> [6] Spreadsheet for generating translit_cyrillic
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>> [9] translit-test-input.txt
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>
>>> Best regards,
>>> Egor Kobylkin
>>>
>>> ---
>>> 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
>>>
>>> [BZ #2872]
>>> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>> table from Cyrillic to Latin.
>>> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>> section.
>>> * locales/aa_DJ: likewise
>>> * locales/af_ZA: likewise
>>> * locales/ak_GH: likewise
>>> * locales/am_ET: likewise
>>> * locales/ar_EG: likewise
>>> * locales/be_BY: likewise
>>> * locales/bem_ZM: likewise
>>> * locales/ber_DZ: likewise
>>> * locales/ber_MA: likewise
>>> * locales/bg_BG: likewise
>>> * locales/bi_VU: likewise
>>> * locales/bn_BD: likewise
>>> * locales/bo_CN: likewise
>>> * locales/ca_ES: likewise
>>> * locales/ce_RU: likewise
>>> * locales/cs_CZ: likewise
>>> * locales/cv_RU: likewise
>>> * locales/cy_GB: likewise
>>> * locales/da_DK: likewise
>>> * locales/de_DE: likewise
>>> * locales/dv_MV: likewise
>>> * locales/dz_BT: likewise
>>> * locales/el_GR: likewise
>>> * locales/en_GB: likewise
>>> * locales/en_NG: likewise
>>> * locales/en_ZM: likewise
>>> * locales/es_CU: likewise
>>> * locales/es_ES: likewise
>>> * locales/et_EE: likewise
>>> * locales/fa_IR: likewise
>>> * locales/ff_SN: likewise
>>> * locales/fi_FI: likewise
>>> * locales/fr_FR: likewise
>>> * locales/ga_IE: likewise
>>> * locales/gd_GB: likewise
>>> * locales/gu_IN: likewise
>>> * locales/gv_GB: likewise
>>> * locales/he_IL: likewise
>>> * locales/hi_IN: likewise
>>> * locales/hif_FJ: likewise
>>> * locales/hr_HR: likewise
>>> * locales/ht_HT: likewise
>>> * locales/hu_HU: likewise
>>> * locales/hy_AM: likewise
>>> * locales/id_ID: likewise
>>> * locales/is_IS: likewise
>>> * locales/it_IT: likewise
>>> * locales/ja_JP: likewise
>>> * locales/kk_KZ: likewise
>>> * locales/km_KH: likewise
>>> * locales/kn_IN: likewise
>>> * locales/ko_KR: likewise
>>> * locales/ks_IN: likewise
>>> * locales/kw_GB: likewise
>>> * locales/lb_LU: likewise
>>> * locales/lg_UG: likewise
>>> * locales/lij_IT: likewise
>>> * locales/ln_CD: likewise
>>> * locales/lo_LA: likewise
>>> * locales/lt_LT: likewise
>>> * locales/lv_LV: likewise
>>> * locales/mg_MG: likewise
>>> * locales/mhr_RU: likewise
>>> * locales/mk_MK: likewise
>>> * locales/ml_IN: likewise
>>> * locales/ms_MY: likewise
>>> * locales/mt_MT: likewise
>>> * locales/nan_TW@latin: likewise
>>> * locales/nb_NO: likewise
>>> * locales/ne_NP: likewise
>>> * locales/nhn_MX: likewise
>>> * locales/niu_NU: likewise
>>> * locales/niu_NZ: likewise
>>> * locales/nl_NL: likewise
>>> * locales/nr_ZA: likewise
>>> * locales/oc_FR: likewise
>>> * locales/om_KE: likewise
>>> * locales/or_IN: likewise
>>> * locales/os_RU: likewise
>>> * locales/pa_IN: likewise
>>> * locales/pa_PK: likewise
>>> * locales/pl_PL: likewise
>>> * locales/pt_PT: likewise
>>> * locales/quz_PE: likewise
>>> * locales/ro_RO: likewise
>>> * locales/ru_RU: likewise
>>> * locales/rw_RW: likewise
>>> * locales/sa_IN: likewise
>>> * locales/sd_IN: likewise
>>> * locales/sd_IN@devanagari: likewise
>>> * locales/sd_PK: likewise
>>> * locales/se_NO: likewise
>>> * locales/sgs_LT: likewise
>>> * locales/si_LK: likewise
>>> * locales/sk_SK: likewise
>>> * locales/sl_SI: likewise
>>> * locales/sm_WS: likewise
>>> * locales/so_SO: likewise
>>> * locales/sq_AL: likewise
>>> * locales/ss_ZA: likewise
>>> * locales/st_ZA: likewise
>>> * locales/sv_SE: likewise
>>> * locales/sw_KE: likewise
>>> * locales/ta_IN: likewise
>>> * locales/te_IN: likewise
>>> * locales/th_TH: likewise
>>> * locales/ti_ET: likewise
>>> * locales/tn_ZA: likewise
>>> * locales/to_TO: likewise
>>> * locales/tpi_PG: likewise
>>> * locales/tr_TR: likewise
>>> * locales/ts_ZA: likewise
>>> * locales/unm_US: likewise
>>> * locales/ur_IN: likewise
>>> * locales/ur_PK: likewise
>>> * locales/ve_ZA: likewise
>>> * locales/vi_VN: likewise
>>> * locales/wa_BE: likewise
>>> * locales/wo_SN: likewise
>>> * locales/xh_ZA: likewise
>>> * locales/yi_US: likewise
>>> * locales/zh_CN: likewise
>>> * locales/zu_ZA: likewise
>>>
>>>
>>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>>> --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
>>> @@ -2292,6 +2292,7 @@
>>>
>>> translit_start
>>> include
"translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ >>> --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000 >>> +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000 >>> @@ -70,6 +70,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA >>> --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000 >>> +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000 >>> @@ -72,6 +72,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH >>> --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000 >>> +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000 >>> @@ -56,6 +56,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET >>> --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000 >>> +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000 >>> @@ -1396,6 +1396,7 @@ >>> <U137A> <U0060><U0039><U0030> >>> <U137B> <U0060><U0031><U0030><U0030> >>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> >>> +include "translit_cyrillic";"" >>> translit_end >>> % >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG >>> --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000 >>> +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000 >>> @@ -44,6 +44,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END 
LC_CTYPE >>> >>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY >>> --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000 >>> @@ -69,6 +69,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM >>> --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000 >>> @@ -42,6 +42,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ >>> --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000 >>> @@ -166,6 +166,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA >>> --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000 >>> @@ -86,6 +86,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG >>> --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000 >>> @@ -49,6 +49,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU >>> --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000 >>> +++ 
b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000 >>> @@ -39,6 +39,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD >>> --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000 >>> @@ -63,6 +63,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN >>> --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000 >>> @@ -43,6 +43,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES >>> --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000 >>> @@ -72,6 +72,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU >>> --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000 >>> @@ -39,6 +39,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ >>> --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000 >>> +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000 >>> @@ -2311,6 +2311,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include 
"translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU >>> --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000 >>> @@ -109,6 +109,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB >>> --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000 >>> @@ -69,6 +69,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK >>> --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000 >>> @@ -167,6 +167,7 @@ >>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE >>> --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000 >>> @@ -78,6 +78,7 @@ >>> % DOUBLE HIGH-REVERSED-9 QUOTATION MARK >>> <U201F> <U00AB>;<U0022> >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV >>> --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000 >>> @@ -52,6 +52,7 @@ >>> include "translit_combining";"" >>> >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/dz_BT 
b/localedata/locales/dz_BT >>> --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000 >>> @@ -60,6 +60,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR >>> --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB >>> --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000 >>> @@ -55,6 +55,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG >>> --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000 >>> +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000 >>> @@ -50,6 +50,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM >>> --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000 >>> +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000 >>> @@ -42,6 +42,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU >>> --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000 >>> +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000 >>> 
@@ -59,6 +59,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES >>> --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000 >>> +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000 >>> @@ -73,6 +73,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE >>> --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000 >>> +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000 >>> @@ -109,6 +109,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR >>> --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000 >>> +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000 >>> @@ -79,6 +79,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN >>> --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000 >>> +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000 >>> @@ -42,6 +42,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI >>> --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000 >>> +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000 >>> @@ -137,6 +137,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr 
a/localedata/locales/fr_FR b/localedata/locales/fr_FR >>> --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> % In France, accents are simply omitted if they cannot be represented. >>> include "translit_combining";"" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE >>> --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000 >>> @@ -54,6 +54,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB >>> --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000 >>> @@ -47,6 +47,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN >>> --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000 >>> @@ -62,6 +62,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB >>> --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000 >>> @@ -57,6 +57,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL >>> --- a/localedata/locales/he_IL 2018-07-17 
17:49:16.000000000 +0000 >>> +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN >>> --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000 >>> @@ -61,6 +61,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ >>> --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000 >>> @@ -37,6 +37,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR >>> --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000 >>> @@ -153,6 +153,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT >>> --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU >>> --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000 >>> +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000 >>> @@ -478,6 +478,7 @@ >>> <U00FC> 
"<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>" >>> <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM >>> --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000 >>> @@ -77,6 +77,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID >>> --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000 >>> @@ -55,6 +55,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS >>> --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000 >>> @@ -2161,6 +2161,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT >>> --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP >>> --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000 >>> @@ -1682,6 +1682,7 @@ >>> include "translit_combining";"" >>> include "translit_cjk_variants";"" >>> >>> +include 
"translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ >>> --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000 >>> @@ -158,6 +158,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH >>> --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000 >>> @@ -873,6 +873,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN >>> --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000 >>> @@ -63,6 +63,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR >>> --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000 >>> @@ -6099,6 +6099,7 @@ >>> include "translit_combining";"" >>> include "translit_hangul";"" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN >>> --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000 >>> @@ -46,6 +46,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB >>> --- 
a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000 >>> @@ -58,6 +58,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU >>> --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000 >>> @@ -78,6 +78,7 @@ >>> % LATIN SMALL LETTER E WITH CIRCUMFLEX >>> <U00EA> "<U0065><U005E>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG >>> --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000 >>> @@ -57,6 +57,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT >>> --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000 >>> +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000 >>> @@ -47,6 +47,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD >>> --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000 >>> @@ -39,6 +39,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA >>> --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000 >>> @@ 
-51,6 +51,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT >>> --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000 >>> @@ -77,6 +77,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV >>> --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000 >>> @@ -2122,6 +2122,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG >>> --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000 >>> @@ -55,6 +55,7 @@ >>> % Accents are simply omitted if they cannot be represented. 
>>> include "translit_combining";"" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU >>> --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK >>> --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000 >>> @@ -49,6 +49,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN >>> --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000 >>> @@ -60,6 +60,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> % >>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY >>> --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000 >>> @@ -45,6 +45,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT >>> --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000 >>> @@ -47,6 +47,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/nan_TW@latin >>> 
b/localedata/locales/nan_TW@latin >>> --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000 >>> @@ -53,6 +53,7 @@ >>> % accents are simply omitted if they cannot be represented. >>> include "translit_combining";"" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO >>> --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000 >>> @@ -154,6 +154,7 @@ >>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP >>> --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000 >>> @@ -43,6 +43,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX >>> --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000 >>> @@ -60,6 +60,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU >>> --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000 >>> @@ -60,6 +60,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ >>> --- 
a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000 >>> @@ -60,6 +60,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL >>> --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000 >>> +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000 >>> @@ -57,6 +57,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA >>> --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000 >>> @@ -66,6 +66,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR >>> --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000 >>> @@ -62,6 +62,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE >>> --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000 >>> @@ -140,6 +140,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN >>> --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000 >>> @@ -62,6 +62,7 @@ 
>>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU >>> --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000 >>> @@ -70,6 +70,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN >>> --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000 >>> @@ -60,6 +60,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK >>> --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000 >>> @@ -58,6 +58,7 @@ >>> % Farsi yeh -> yeh >>> <U06CC> "<U064A>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL >>> --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000 >>> @@ -142,6 +142,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT >>> --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/quz_PE 
b/localedata/locales/quz_PE >>> --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000 >>> @@ -57,6 +57,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO >>> --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000 >>> @@ -144,6 +144,7 @@ >>> <U0162> "<U021A>";"<U0054>" >>> <U0163> "<U021B>";"<U0074>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU >>> --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000 >>> @@ -74,6 +74,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW >>> --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000 >>> @@ -45,6 +45,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN >>> --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000 >>> @@ -44,6 +44,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN >>> --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/sd_IN 2018-07-17 
17:55:51.000000000 +0000 >>> @@ -46,6 +46,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sd_IN@devanagari >>> b/localedata/locales/sd_IN@devanagari >>> --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 >>> +0000 >>> +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 >>> +0000 >>> @@ -44,6 +44,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK >>> --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000 >>> @@ -39,6 +39,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO >>> --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000 >>> @@ -205,6 +205,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT >>> --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000 >>> @@ -59,6 +59,7 @@ >>> copy "i18n" >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK >>> --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000 >>> @@ -45,6 +45,7 @@ >>> >>> translit_start >>> include 
"translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK >>> --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000 >>> @@ -68,6 +68,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI >>> --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000 >>> +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000 >>> @@ -91,6 +91,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS >>> --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000 >>> @@ -37,6 +37,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO >>> --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000 >>> @@ -70,6 +70,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL >>> --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000 >>> @@ -45,6 +45,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA >>> --- 
a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000 >>> @@ -68,6 +68,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA >>> --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000 >>> @@ -64,6 +64,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE >>> --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000 >>> @@ -139,6 +139,7 @@ >>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE >>> --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000 >>> @@ -44,6 +44,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN >>> --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000 >>> @@ -63,6 +63,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN >>> --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000 
>>> @@ -63,6 +63,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH >>> --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000 >>> @@ -58,6 +58,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET >>> --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000 >>> @@ -866,6 +866,7 @@ >>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> >>> >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> % >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA >>> --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000 >>> @@ -69,6 +69,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO >>> --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000 >>> @@ -36,6 +36,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG >>> --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000 >>> +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000 >>> @@ -37,6 +37,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> 
translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR >>> --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000 >>> @@ -2430,6 +2430,7 @@ >>> >>> % TURKISH LIRA SIGN >>> <U20BA> "<U0054><U004C>" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/translit_cyrillic >>> b/localedata/locales/translit_cyrillic >>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 >>> +0000 >>> +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 >>> +0000 >>> @@ -0,0 +1,151 @@ >>> +escape_char / >>> +comment_char % >>> + >>> +% Transliterations that converts cyrillic letters to ascii symbols >>> inspired by GOST 7.79-2000 >>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872 >>> +% Generated from UnicodeData.txt with >>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590 >>> +% Up to three characters are required to do a reversible transliteration. 
>>> + >>> +LC_CTYPE >>> + >>> +translit_start >>> + >>> + >>> +% CYRILLIC CAPITAL LETTER IO >>> +<U0401> "<U0059><U004F>";<U0059> >>> +% CYRILLIC CAPITAL LETTER A >>> +<U0410> <U0041> >>> +% CYRILLIC CAPITAL LETTER BE >>> +<U0411> <U0042> >>> +% CYRILLIC CAPITAL LETTER VE >>> +<U0412> <U0056> >>> +% CYRILLIC CAPITAL LETTER GHE >>> +<U0413> <U0047> >>> +% CYRILLIC CAPITAL LETTER DE >>> +<U0414> <U0044> >>> +% CYRILLIC CAPITAL LETTER IE >>> +<U0415> <U0045> >>> +% CYRILLIC CAPITAL LETTER ZHE >>> +<U0416> "<U005A><U0048>";<U005A> >>> +% CYRILLIC CAPITAL LETTER ZE >>> +<U0417> <U005A> >>> +% CYRILLIC CAPITAL LETTER I >>> +<U0418> <U0049> >>> +% CYRILLIC CAPITAL LETTER SHORT I >>> +<U0419> <U004A> >>> +% CYRILLIC CAPITAL LETTER KA >>> +<U041A> <U004B> >>> +% CYRILLIC CAPITAL LETTER EL >>> +<U041B> <U004C> >>> +% CYRILLIC CAPITAL LETTER EM >>> +<U041C> <U004D> >>> +% CYRILLIC CAPITAL LETTER EN >>> +<U041D> <U004E> >>> +% CYRILLIC CAPITAL LETTER O >>> +<U041E> <U004F> >>> +% CYRILLIC CAPITAL LETTER PE >>> +<U041F> <U0050> >>> +% CYRILLIC CAPITAL LETTER ER >>> +<U0420> <U0052> >>> +% CYRILLIC CAPITAL LETTER ES >>> +<U0421> <U0053> >>> +% CYRILLIC CAPITAL LETTER TE >>> +<U0422> <U0054> >>> +% CYRILLIC CAPITAL LETTER U >>> +<U0423> <U0055> >>> +% CYRILLIC CAPITAL LETTER EF >>> +<U0424> <U0046> >>> +% CYRILLIC CAPITAL LETTER HA >>> +<U0425> <U0058> >>> +% CYRILLIC CAPITAL LETTER TSE >>> +<U0426> "<U0043><U005A>";<U0043> >>> +% CYRILLIC CAPITAL LETTER CHE >>> +<U0427> "<U0043><U0048>";<U0043> >>> +% CYRILLIC CAPITAL LETTER SHA >>> +<U0428> "<U0053><U0048>";<U0053> >>> +% CYRILLIC CAPITAL LETTER SHCHA >>> +<U0429> "<U0053><U0048><U0048>";<U0053> >>> +% CYRILLIC CAPITAL LETTER HARD SIGN >>> +<U042A> "<U0060><U0060>";<U0060> >>> +% CYRILLIC CAPITAL LETTER YERU >>> +<U042B> "<U0059><U0027>";<U0059> >>> +% CYRILLIC CAPITAL LETTER SOFT SIGN >>> +<U042C> <U0060> >>> +% CYRILLIC CAPITAL LETTER E >>> +<U042D> "<U0045><U0060>";<U0045> >>> +% CYRILLIC CAPITAL LETTER YU >>> +<U042E> 
"<U0059><U0055>";<U0059> >>> +% CYRILLIC CAPITAL LETTER YA >>> +<U042F> "<U0059><U0041>";<U0059> >>> +% CYRILLIC SMALL LETTER A >>> +<U0430> <U0061> >>> +% CYRILLIC SMALL LETTER BE >>> +<U0431> <U0062> >>> +% CYRILLIC SMALL LETTER VE >>> +<U0432> <U0076> >>> +% CYRILLIC SMALL LETTER GHE >>> +<U0433> <U0067> >>> +% CYRILLIC SMALL LETTER DE >>> +<U0434> <U0064> >>> +% CYRILLIC SMALL LETTER IE >>> +<U0435> <U0065> >>> +% CYRILLIC SMALL LETTER ZHE >>> +<U0436> "<U007A><U0068>";<U007A> >>> +% CYRILLIC SMALL LETTER ZE >>> +<U0437> <U007A> >>> +% CYRILLIC SMALL LETTER I >>> +<U0438> <U0069> >>> +% CYRILLIC SMALL LETTER SHORT I >>> +<U0439> <U006A> >>> +% CYRILLIC SMALL LETTER KA >>> +<U043A> <U006B> >>> +% CYRILLIC SMALL LETTER EL >>> +<U043B> <U006C> >>> +% CYRILLIC SMALL LETTER EM >>> +<U043C> <U006D> >>> +% CYRILLIC SMALL LETTER EN >>> +<U043D> <U006E> >>> +% CYRILLIC SMALL LETTER O >>> +<U043E> <U006F> >>> +% CYRILLIC SMALL LETTER PE >>> +<U043F> <U0070> >>> +% CYRILLIC SMALL LETTER ER >>> +<U0440> <U0072> >>> +% CYRILLIC SMALL LETTER ES >>> +<U0441> <U0073> >>> +% CYRILLIC SMALL LETTER TE >>> +<U0442> <U0074> >>> +% CYRILLIC SMALL LETTER U >>> +<U0443> <U0075> >>> +% CYRILLIC SMALL LETTER EF >>> +<U0444> <U0066> >>> +% CYRILLIC SMALL LETTER HA >>> +<U0445> <U0078> >>> +% CYRILLIC SMALL LETTER TSE >>> +<U0446> "<U0063><U007A>";<U0063> >>> +% CYRILLIC SMALL LETTER CHE >>> +<U0447> "<U0063><U0068>";<U0063> >>> +% CYRILLIC SMALL LETTER SHA >>> +<U0448> "<U0073><U0068>";<U0073> >>> +% CYRILLIC SMALL LETTER SHCHA >>> +<U0449> "<U0073><U0068><U0068>";<U0073> >>> +% CYRILLIC SMALL LETTER HARD SIGN >>> +<U044A> "<U0060><U0060>";<U0060> >>> +% CYRILLIC SMALL LETTER YERU >>> +<U044B> "<U0079><U0027>";<U0079> >>> +% CYRILLIC SMALL LETTER SOFT SIGN >>> +<U044C> <U0060> >>> +% CYRILLIC SMALL LETTER E >>> +<U044D> "<U0065><U0060>";<U0065> >>> +% CYRILLIC SMALL LETTER YU >>> +<U044E> "<U0079><U0075>";<U0079> >>> +% CYRILLIC SMALL LETTER YA >>> +<U044F> "<U0079><U0061>";<U0079> >>> 
+% CYRILLIC SMALL LETTER IO >>> +<U0451> "<U0079><U006F>";<U0079> >>> + >>> + >>> +translit_end >>> + >>> +END LC_CTYPE >>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA >>> --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000 >>> @@ -64,6 +64,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US >>> --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000 >>> @@ -48,6 +48,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN >>> --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000 >>> @@ -46,6 +46,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK >>> --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000 >>> @@ -58,6 +58,7 @@ >>> % Farsi yeh -> yeh >>> <U06CC> "<U064A>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA >>> --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000 >>> @@ -67,6 +67,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/vi_VN 
b/localedata/locales/vi_VN >>> --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000 >>> @@ -58,6 +58,7 @@ >>> % dong sign -> d// -> dd >>> <U20AB> "<U0111>";"<U0064><U0064>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE >>> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000 >>> @@ -69,6 +69,7 @@ >>> <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>" >>> <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN >>> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000 >>> @@ -55,6 +55,7 @@ >>> % Accents are simply omitted if they cannot be represented. 
>>> include "translit_combining";"" >>> >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA >>> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000 >>> @@ -66,6 +66,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US >>> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000 >>> @@ -73,6 +73,7 @@ >>> <U05F0> "<U05D5><U05D5>";"<U0077><U0077>" >>> <U05F1> "<U05D5><U05D9>";"<U0077><U006A>" >>> <U05F2> "<U05D9><U05D9>";"<U006A><U006A>" >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> END LC_CTYPE >>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN >>> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000 >>> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000 >>> @@ -58,6 +58,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> >>> class "hanzi"; / >>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA >>> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000 >>> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000 >>> @@ -70,6 +70,7 @@ >>> >>> translit_start >>> include "translit_combining";"" >>> +include "translit_cyrillic";"" >>> translit_end >>> END LC_CTYPE >>> >>> >>>
Hi Egor,

Thanks for your patience with this one.

On 2018-10-03 12:32, Egor Kobylkin wrote:
> On 03.10.2018 11:19, Keld Simonsen wrote:
>>
>> Please note that transliteration of Cyrillic to Latin is not universal.
>> There are different schemes for, for example, German, English and Danish, and
>> there is also an ISO standard for it.
>
> Thanks for your feedback, Keld!
>
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?
>
> That is:
> - In the case that there is a different preferred Cyrillic
> transliteration table for any specific locale, their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default Cyrillic
> transliteration altogether.

The Wikipedia article https://en.wikipedia.org/wiki/ISO_9 helps to
understand that ISO 9:1995 and GOST 7.79-2000 System A are identical, so
perhaps you could mention both ISO 9 and the Wikipedia article in the
commit log. translit_cyrillic includes every transliteration defined in
ISO 9:1995 and GOST 7.79-2000, correct?

I think that for those locales which already have Cyrillic transliteration
defined, it would be best to leave them as-is (as you've done) unless there
are some issues with them; there's probably a good reason why they were
added in the first place. For other locales, using ISO 9 instead of not
doing transliteration at all may not be entirely correct, but I'd suppose
it's better to provide at least some sort of transliteration (even if not
entirely correct) than sequences of question marks. But as you say, locale
maintainers may know better the case for individual locales.
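To make the difference between "some sort of transliteration" and sequences
of question marks concrete, here is a minimal Python sketch. It is only an
illustration and not how glibc or iconv implements //TRANSLIT: the LOWER
dictionary copies the first-choice replacements from the translit_cyrillic
table in this patch, while the names LOWER, TABLE and translit, and the "?"
fallback for unmapped non-ASCII characters, are assumptions made for the
example.

```python
# Lowercase first-choice replacements copied from the translit_cyrillic
# table in this patch. In that table the capital-letter rows are the same
# strings upper-cased, so they are derived below instead of repeated.
LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "yo", "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "cz",
    "ч": "ch", "ш": "sh", "щ": "shh", "ъ": "``", "ы": "y'", "ь": "`",
    "э": "e`", "ю": "yu", "я": "ya",
}

TABLE = dict(LOWER)
TABLE.update({key.upper(): value.upper() for key, value in LOWER.items()})


def translit(text):
    """Replace Cyrillic letters found in TABLE; keep ASCII unchanged.

    Any other character becomes "?", imitating (as an assumption) what the
    unpatched iconv run in the bug report prints for unmapped input.
    """
    out = []
    for ch in text:
        if ch in TABLE:
            out.append(TABLE[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            out.append("?")
    return "".join(out)
```

Applied to the Cyrillic test sentence from the bug report, translit() gives
the same string as the patched iconv run quoted in the original submission,
while a letter outside the table (for example Ukrainian U+0457) still comes
out as "?".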
Regarding the language-specific differences Keld mentioned, the Finnish
Wikipedia article on transliteration gives an example; see the table on the
right at https://fi.wikipedia.org/wiki/Siirtokirjoitus for Russian /
international / Finnish / Swedish / English / French / German / Polish /
phonetic transliteration of a Russian name. (The table also shows that for
correct transliteration ASCII letters are not enough for some languages.)

Some of the differences and language-specific aspects are probably
impossible to take fully into account within the locale system we have
today. For example, in Finnish (the tables at http://jkorpela.fi/iso9.html8
and https://fi.wikipedia.org/wiki/Ven%C3%A4j%C3%A4n_translitterointi might
also be helpful):

1) transliteration of Russian is mostly as per ISO 9 but with national
   differences defined in SFS 4900
2) transliteration of Russian and Ukrainian names has some slight
   differences according to http://jkorpela.fi/iso9.html8
3) transliteration of a letter depends on its position within a word or
   on the pronunciation of adjacent letters; for example, U+0435 becomes
   U+0065 (e), except that at the beginning of a word it becomes
   U+006A U+0065 (je)

Hopefully we'll hear comments from others as well. Once your patch is
merged, I'll try to come up with the needed locale-specific changes for
fi_FI. Some of the differences referred to in 1) above are straightforward
to implement, but for 2) and 3) some compromises probably need to be made,
unfortunately.

Thanks,

>> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>>> Ping.
>>>
>>> Absent feedback, I am wondering if anything could be missing in this
>>> patch from the maintainers' standpoint. More than two months have passed
>>> since the original submission.
>>>
>>> If I can be of assistance, please do not hesitate to contact me,
>>> Egor Kobylkin
>>>
>>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>>> Dear locale maintainers,
>>>>
>>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>>>
>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>>
>>>> add Cyrillic transliteration table translit_cyrillic file
>>>>
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>>
>>>> to localedata/locales/ and include it in all your locales going forward.
>>>>
>>>> Patch included inline below.
>>>>
>>>> This is a re-submission for the consideration for 2.29 on a request from
>>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>>
>>>> From this patch I have excluded locales that already mention cyrillic or
>>>> have a transliteration table for it:
>>>> az_AZ
>>>> iso14651_t1_common
>>>> ky_KG
>>>> mn_MN
>>>> sr_RS
>>>> tg_TJ
>>>> tk_TM
>>>> tt_RU
>>>> uk_UA
>>>> uz_UZ
>>>> uz_UZ@cyrillic
>>>>
>>>> Their maintainers are requested to make an explicit decision on how and
>>>> whether at all to include this patch.
>>>>
>>>>
>>>>
>>>> Current bug effect:
>>>>
>>>> The glibc wiki explicitly lists this use case as the test example
>>>>
>>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>>
>>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
>>>>
>>>> currently it fails on Cyrillic texts in most locales including ru_RU [1]
>>>> [8] [9]:
>>>>
>>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
>>>>
>>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>>
>>>> - It produces a string of question marks and spaces.
>>>>
>>>> This is what it should produce and it does so after the patch applied:
>>>>
>>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
>>>>
>>>>
>>>> Root problem and the fix:
>>>>
>>>> The root problem is the missing transliteration table that I am
>>>> supplying here. Furthermore, it has to be referenced/included into the
>>>> active locale at compilation time to be used by iconv.
>>>>
>>>>
>>>>
>>>> COMMIT MESSAGE:
>>>> This translit_cyrillic table enables conversion (e.g. with iconv) from a
>>>> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>>>>
>>>> While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
>>>> a transliteration has only ASCII codes but still can be read by a native
>>>> speaker. Among other things it is useful for processing the Cyrillic
>>>> texts and filenames by programs or on systems that are not specifically
>>>> prepared to work with Cyrillic, don't have corresponding fonts installed
>>>> or can't handle UTF-8.
>>>>
>>>> The transliteration table itself is attached as a file translit_cyrillic
>>>> [7]. Its content (mapping) is based on the GOST 7.79-2000 official source
>>>> (Federal Agency on Technical Regulating and Metrology Of Russian
>>>> Federation [2]). Technically an independent but identical source [3] was
>>>> used and prepared in a spreadsheet [6].
>>>>
>>>> The documentation suggests that the transliteration table's inclusion is
>>>> done by adding the *include "translit_cyrillic";""* string into the LC_CTYPE
>>>> translit_start section
>>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>>> Practically I have searched for all locales that have a
>>>> translit_start/end stanza and generated a patch for them.
>>>>
>>>> The Cyrillic transliteration of e.g. Russian text may have already
>>>> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
>>>> have their transliteration tables included inline.
>>>> However it would not be the standard Russian Cyrillic transliteration as
>>>> described above.
>>>> I am excluding these locales from this proposed patch.
>>>> I have written directly to locale maintainer emails listed in the
>>>> files. Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny
>>>> <mkutny@gmail.com> (uk_UA), Данило Шеган <danilo@gnome.org> (sr_YU,
>>>> sr_CS) have confirmed the exclusion.
>>>>
>>>> Links:
>>>>
>>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> [2] GOST 7.79-2000 official source
>>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>>> available in low quality gif format)
>>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>>> http://www.yfermer.ru/specifications/285821.html
>>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>>> [6] Spreadsheet for generating translit_cyrillic
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>>> [9] translit-test-input.txt
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>>
>>>> Best regards,
>>>> Egor Kobylkin
>>>>
>>>> ---
>>>> 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
>>>>
>>>> [BZ #2872]
>>>> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>>> table from Cyrillic to Latin.
>>>> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>>> section.
>>>> * locales/aa_DJ: likewise >>>> * locales/af_ZA: likewise >>>> * locales/ak_GH: likewise >>>> * locales/am_ET: likewise >>>> * locales/ar_EG: likewise >>>> * locales/be_BY: likewise >>>> * locales/bem_ZM: likewise >>>> * locales/ber_DZ: likewise >>>> * locales/ber_MA: likewise >>>> * locales/bg_BG: likewise >>>> * locales/bi_VU: likewise >>>> * locales/bn_BD: likewise >>>> * locales/bo_CN: likewise >>>> * locales/ca_ES: likewise >>>> * locales/ce_RU: likewise >>>> * locales/cs_CZ: likewise >>>> * locales/cv_RU: likewise >>>> * locales/cy_GB: likewise >>>> * locales/da_DK: likewise >>>> * locales/de_DE: likewise >>>> * locales/dv_MV: likewise >>>> * locales/dz_BT: likewise >>>> * locales/el_GR: likewise >>>> * locales/en_GB: likewise >>>> * locales/en_NG: likewise >>>> * locales/en_ZM: likewise >>>> * locales/es_CU: likewise >>>> * locales/es_ES: likewise >>>> * locales/et_EE: likewise >>>> * locales/fa_IR: likewise >>>> * locales/ff_SN: likewise >>>> * locales/fi_FI: likewise >>>> * locales/fr_FR: likewise >>>> * locales/ga_IE: likewise >>>> * locales/gd_GB: likewise >>>> * locales/gu_IN: likewise >>>> * locales/gv_GB: likewise >>>> * locales/he_IL: likewise >>>> * locales/hi_IN: likewise >>>> * locales/hif_FJ: likewise >>>> * locales/hr_HR: likewise >>>> * locales/ht_HT: likewise >>>> * locales/hu_HU: likewise >>>> * locales/hy_AM: likewise >>>> * locales/id_ID: likewise >>>> * locales/is_IS: likewise >>>> * locales/it_IT: likewise >>>> * locales/ja_JP: likewise >>>> * locales/kk_KZ: likewise >>>> * locales/km_KH: likewise >>>> * locales/kn_IN: likewise >>>> * locales/ko_KR: likewise >>>> * locales/ks_IN: likewise >>>> * locales/kw_GB: likewise >>>> * locales/lb_LU: likewise >>>> * locales/lg_UG: likewise >>>> * locales/lij_IT: likewise >>>> * locales/ln_CD: likewise >>>> * locales/lo_LA: likewise >>>> * locales/lt_LT: likewise >>>> * locales/lv_LV: likewise >>>> * locales/mg_MG: likewise >>>> * locales/mhr_RU: likewise >>>> * locales/mk_MK: likewise >>>> * 
locales/ml_IN: likewise >>>> * locales/ms_MY: likewise >>>> * locales/mt_MT: likewise >>>> * locales/nan_TW@latin: likewise >>>> * locales/nb_NO: likewise >>>> * locales/ne_NP: likewise >>>> * locales/nhn_MX: likewise >>>> * locales/niu_NU: likewise >>>> * locales/niu_NZ: likewise >>>> * locales/nl_NL: likewise >>>> * locales/nr_ZA: likewise >>>> * locales/oc_FR: likewise >>>> * locales/om_KE: likewise >>>> * locales/or_IN: likewise >>>> * locales/os_RU: likewise >>>> * locales/pa_IN: likewise >>>> * locales/pa_PK: likewise >>>> * locales/pl_PL: likewise >>>> * locales/pt_PT: likewise >>>> * locales/quz_PE: likewise >>>> * locales/ro_RO: likewise >>>> * locales/ru_RU: likewise >>>> * locales/rw_RW: likewise >>>> * locales/sa_IN: likewise >>>> * locales/sd_IN: likewise >>>> * locales/sd_IN@devanagari: likewise >>>> * locales/sd_PK: likewise >>>> * locales/se_NO: likewise >>>> * locales/sgs_LT: likewise >>>> * locales/si_LK: likewise >>>> * locales/sk_SK: likewise >>>> * locales/sl_SI: likewise >>>> * locales/sm_WS: likewise >>>> * locales/so_SO: likewise >>>> * locales/sq_AL: likewise >>>> * locales/ss_ZA: likewise >>>> * locales/st_ZA: likewise >>>> * locales/sv_SE: likewise >>>> * locales/sw_KE: likewise >>>> * locales/ta_IN: likewise >>>> * locales/te_IN: likewise >>>> * locales/th_TH: likewise >>>> * locales/ti_ET: likewise >>>> * locales/tn_ZA: likewise >>>> * locales/to_TO: likewise >>>> * locales/tpi_PG: likewise >>>> * locales/tr_TR: likewise >>>> * locales/ts_ZA: likewise >>>> * locales/unm_US: likewise >>>> * locales/ur_IN: likewise >>>> * locales/ur_PK: likewise >>>> * locales/ve_ZA: likewise >>>> * locales/vi_VN: likewise >>>> * locales/wa_BE: likewise >>>> * locales/wo_SN: likewise >>>> * locales/xh_ZA: likewise >>>> * locales/yi_US: likewise >>>> * locales/zh_CN: likewise >>>> * locales/zu_ZA: likewise >>>> >>>> >>>> diff -uNr a/localedata/locales/C b/localedata/locales/C >>>> --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000 >>>> +++ 
b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -2292,6 +2292,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ >>>> --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA >>>> --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -72,6 +72,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH >>>> --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -56,6 +56,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET >>>> --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -1396,6 +1396,7 @@ >>>> <U137A> <U0060><U0039><U0030> >>>> <U137B> <U0060><U0031><U0030><U0030> >>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> % >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG >>>> --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/ar_EG 
2018-07-17 17:55:48.000000000 +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY >>>> --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM >>>> --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -42,6 +42,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ >>>> --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -166,6 +166,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA >>>> --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -86,6 +86,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG >>>> --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -49,6 +49,7 @@ >>>> >>>> translit_start >>>> include 
"translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU >>>> --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD >>>> --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN >>>> --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -43,6 +43,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES >>>> --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -72,6 +72,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU >>>> --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr 
a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ >>>> --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -2311,6 +2311,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU >>>> --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -109,6 +109,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB >>>> --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK >>>> --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -167,6 +167,7 @@ >>>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE >>>> --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -78,6 +78,7 @@ >>>> % DOUBLE HIGH-REVERSED-9 QUOTATION MARK >>>> <U201F> <U00AB>;<U0022> >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/dv_MV 
b/localedata/locales/dv_MV >>>> --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -52,6 +52,7 @@ >>>> include "translit_combining";"" >>>> >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT >>>> --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR >>>> --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB >>>> --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG >>>> --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -50,6 +50,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM >>>> --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/en_ZM 
2018-07-17 17:55:48.000000000 +0000 >>>> @@ -42,6 +42,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU >>>> --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES >>>> --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -73,6 +73,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE >>>> --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -109,6 +109,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR >>>> --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -79,6 +79,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN >>>> --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -42,6 +42,7 @@ >>>> >>>> translit_start >>>> include 
"translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI >>>> --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -137,6 +137,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR >>>> --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> % In France, accents are simply omitted if they cannot be represented. >>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE >>>> --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -54,6 +54,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB >>>> --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -47,6 +47,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN >>>> --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -62,6 +62,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include 
"translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB >>>> --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL >>>> --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN >>>> --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -61,6 +61,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ >>>> --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -37,6 +37,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR >>>> --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -153,6 +153,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ht_HT 
b/localedata/locales/ht_HT >>>> --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU >>>> --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -478,6 +478,7 @@ >>>> <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>" >>>> <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM >>>> --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -77,6 +77,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID >>>> --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS >>>> --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -2161,6 +2161,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT 
>>>> --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP >>>> --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -1682,6 +1682,7 @@ >>>> include "translit_combining";"" >>>> include "translit_cjk_variants";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ >>>> --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -158,6 +158,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH >>>> --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -873,6 +873,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN >>>> --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR >>>> --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000 >>>> +++ 
b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -6099,6 +6099,7 @@ >>>> include "translit_combining";"" >>>> include "translit_hangul";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN >>>> --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -46,6 +46,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB >>>> --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU >>>> --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -78,6 +78,7 @@ >>>> % LATIN SMALL LETTER E WITH CIRCUMFLEX >>>> <U00EA> "<U0065><U005E>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG >>>> --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT >>>> --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -47,6 
+47,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD >>>> --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA >>>> --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -51,6 +51,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT >>>> --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -77,6 +77,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV >>>> --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -2122,6 +2122,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG >>>> --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> % Accents are simply omitted if they cannot be represented. 
>>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU >>>> --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK >>>> --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -49,6 +49,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN >>>> --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> % >>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY >>>> --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT >>>> --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -47,6 +47,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END 
LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nan_TW@latin >>>> b/localedata/locales/nan_TW@latin >>>> --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -53,6 +53,7 @@ >>>> % accents are simply omitted if they cannot be represented. >>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO >>>> --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -154,6 +154,7 @@ >>>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP >>>> --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -43,6 +43,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX >>>> --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU >>>> --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end 
>>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ >>>> --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL >>>> --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA >>>> --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -66,6 +66,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR >>>> --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -62,6 +62,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE >>>> --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -140,6 +140,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN >>>> --- 
a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -62,6 +62,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU >>>> --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN >>>> --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK >>>> --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> % Farsi yeh -> yeh >>>> <U06CC> "<U064A>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL >>>> --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -142,6 +142,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT >>>> --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pt_PT 2018-07-17 
17:55:51.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE >>>> --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO >>>> --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -144,6 +144,7 @@ >>>> <U0162> "<U021A>";"<U0054>" >>>> <U0163> "<U021B>";"<U0074>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU >>>> --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -74,6 +74,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW >>>> --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN >>>> --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include 
"translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN >>>> --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -46,6 +46,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sd_IN@devanagari >>>> b/localedata/locales/sd_IN@devanagari >>>> --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 >>>> +0000 >>>> +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 >>>> +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK >>>> --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO >>>> --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -205,6 +205,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT >>>> --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include 
"translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK >>>> --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK >>>> --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -68,6 +68,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI >>>> --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -91,6 +91,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS >>>> --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -37,6 +37,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO >>>> --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sq_AL 
b/localedata/locales/sq_AL >>>> --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA >>>> --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -68,6 +68,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA >>>> --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -64,6 +64,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE >>>> --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -139,6 +139,7 @@ >>>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE >>>> --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN >>>> --- a/localedata/locales/ta_IN 2018-07-17 
17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN >>>> --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH >>>> --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET >>>> --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -866,6 +866,7 @@ >>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> >>>> >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> % >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA >>>> --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO >>>> --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/to_TO 2018-07-17 
17:55:52.000000000 +0000 >>>> @@ -36,6 +36,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG >>>> --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -37,6 +37,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR >>>> --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -2430,6 +2430,7 @@ >>>> >>>> % TURKISH LIRA SIGN >>>> <U20BA> "<U0054><U004C>" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/translit_cyrillic >>>> b/localedata/locales/translit_cyrillic >>>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 >>>> +0000 >>>> +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 >>>> +0000 >>>> @@ -0,0 +1,151 @@ >>>> +escape_char / >>>> +comment_char % >>>> + >>>> +% Transliterations that converts cyrillic letters to ascii symbols >>>> inspired by GOST 7.79-2000 >>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872 >>>> +% Generated from UnicodeData.txt with >>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590 >>>> +% Up to three characters are required to do a reversible transliteration. 
>>>> + >>>> +LC_CTYPE >>>> + >>>> +translit_start >>>> + >>>> + >>>> +% CYRILLIC CAPITAL LETTER IO >>>> +<U0401> "<U0059><U004F>";<U0059> >>>> +% CYRILLIC CAPITAL LETTER A >>>> +<U0410> <U0041> >>>> +% CYRILLIC CAPITAL LETTER BE >>>> +<U0411> <U0042> >>>> +% CYRILLIC CAPITAL LETTER VE >>>> +<U0412> <U0056> >>>> +% CYRILLIC CAPITAL LETTER GHE >>>> +<U0413> <U0047> >>>> +% CYRILLIC CAPITAL LETTER DE >>>> +<U0414> <U0044> >>>> +% CYRILLIC CAPITAL LETTER IE >>>> +<U0415> <U0045> >>>> +% CYRILLIC CAPITAL LETTER ZHE >>>> +<U0416> "<U005A><U0048>";<U005A> >>>> +% CYRILLIC CAPITAL LETTER ZE >>>> +<U0417> <U005A> >>>> +% CYRILLIC CAPITAL LETTER I >>>> +<U0418> <U0049> >>>> +% CYRILLIC CAPITAL LETTER SHORT I >>>> +<U0419> <U004A> >>>> +% CYRILLIC CAPITAL LETTER KA >>>> +<U041A> <U004B> >>>> +% CYRILLIC CAPITAL LETTER EL >>>> +<U041B> <U004C> >>>> +% CYRILLIC CAPITAL LETTER EM >>>> +<U041C> <U004D> >>>> +% CYRILLIC CAPITAL LETTER EN >>>> +<U041D> <U004E> >>>> +% CYRILLIC CAPITAL LETTER O >>>> +<U041E> <U004F> >>>> +% CYRILLIC CAPITAL LETTER PE >>>> +<U041F> <U0050> >>>> +% CYRILLIC CAPITAL LETTER ER >>>> +<U0420> <U0052> >>>> +% CYRILLIC CAPITAL LETTER ES >>>> +<U0421> <U0053> >>>> +% CYRILLIC CAPITAL LETTER TE >>>> +<U0422> <U0054> >>>> +% CYRILLIC CAPITAL LETTER U >>>> +<U0423> <U0055> >>>> +% CYRILLIC CAPITAL LETTER EF >>>> +<U0424> <U0046> >>>> +% CYRILLIC CAPITAL LETTER HA >>>> +<U0425> <U0058> >>>> +% CYRILLIC CAPITAL LETTER TSE >>>> +<U0426> "<U0043><U005A>";<U0043> >>>> +% CYRILLIC CAPITAL LETTER CHE >>>> +<U0427> "<U0043><U0048>";<U0043> >>>> +% CYRILLIC CAPITAL LETTER SHA >>>> +<U0428> "<U0053><U0048>";<U0053> >>>> +% CYRILLIC CAPITAL LETTER SHCHA >>>> +<U0429> "<U0053><U0048><U0048>";<U0053> >>>> +% CYRILLIC CAPITAL LETTER HARD SIGN >>>> +<U042A> "<U0060><U0060>";<U0060> >>>> +% CYRILLIC CAPITAL LETTER YERU >>>> +<U042B> "<U0059><U0027>";<U0059> >>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN >>>> +<U042C> <U0060> >>>> +% CYRILLIC CAPITAL LETTER E >>>> +<U042D> 
"<U0045><U0060>";<U0045> >>>> +% CYRILLIC CAPITAL LETTER YU >>>> +<U042E> "<U0059><U0055>";<U0059> >>>> +% CYRILLIC CAPITAL LETTER YA >>>> +<U042F> "<U0059><U0041>";<U0059> >>>> +% CYRILLIC SMALL LETTER A >>>> +<U0430> <U0061> >>>> +% CYRILLIC SMALL LETTER BE >>>> +<U0431> <U0062> >>>> +% CYRILLIC SMALL LETTER VE >>>> +<U0432> <U0076> >>>> +% CYRILLIC SMALL LETTER GHE >>>> +<U0433> <U0067> >>>> +% CYRILLIC SMALL LETTER DE >>>> +<U0434> <U0064> >>>> +% CYRILLIC SMALL LETTER IE >>>> +<U0435> <U0065> >>>> +% CYRILLIC SMALL LETTER ZHE >>>> +<U0436> "<U007A><U0068>";<U007A> >>>> +% CYRILLIC SMALL LETTER ZE >>>> +<U0437> <U007A> >>>> +% CYRILLIC SMALL LETTER I >>>> +<U0438> <U0069> >>>> +% CYRILLIC SMALL LETTER SHORT I >>>> +<U0439> <U006A> >>>> +% CYRILLIC SMALL LETTER KA >>>> +<U043A> <U006B> >>>> +% CYRILLIC SMALL LETTER EL >>>> +<U043B> <U006C> >>>> +% CYRILLIC SMALL LETTER EM >>>> +<U043C> <U006D> >>>> +% CYRILLIC SMALL LETTER EN >>>> +<U043D> <U006E> >>>> +% CYRILLIC SMALL LETTER O >>>> +<U043E> <U006F> >>>> +% CYRILLIC SMALL LETTER PE >>>> +<U043F> <U0070> >>>> +% CYRILLIC SMALL LETTER ER >>>> +<U0440> <U0072> >>>> +% CYRILLIC SMALL LETTER ES >>>> +<U0441> <U0073> >>>> +% CYRILLIC SMALL LETTER TE >>>> +<U0442> <U0074> >>>> +% CYRILLIC SMALL LETTER U >>>> +<U0443> <U0075> >>>> +% CYRILLIC SMALL LETTER EF >>>> +<U0444> <U0066> >>>> +% CYRILLIC SMALL LETTER HA >>>> +<U0445> <U0078> >>>> +% CYRILLIC SMALL LETTER TSE >>>> +<U0446> "<U0063><U007A>";<U0063> >>>> +% CYRILLIC SMALL LETTER CHE >>>> +<U0447> "<U0063><U0068>";<U0063> >>>> +% CYRILLIC SMALL LETTER SHA >>>> +<U0448> "<U0073><U0068>";<U0073> >>>> +% CYRILLIC SMALL LETTER SHCHA >>>> +<U0449> "<U0073><U0068><U0068>";<U0073> >>>> +% CYRILLIC SMALL LETTER HARD SIGN >>>> +<U044A> "<U0060><U0060>";<U0060> >>>> +% CYRILLIC SMALL LETTER YERU >>>> +<U044B> "<U0079><U0027>";<U0079> >>>> +% CYRILLIC SMALL LETTER SOFT SIGN >>>> +<U044C> <U0060> >>>> +% CYRILLIC SMALL LETTER E >>>> +<U044D> "<U0065><U0060>";<U0065> >>>> +% 
CYRILLIC SMALL LETTER YU >>>> +<U044E> "<U0079><U0075>";<U0079> >>>> +% CYRILLIC SMALL LETTER YA >>>> +<U044F> "<U0079><U0061>";<U0079> >>>> +% CYRILLIC SMALL LETTER IO >>>> +<U0451> "<U0079><U006F>";<U0079> >>>> + >>>> + >>>> +translit_end >>>> + >>>> +END LC_CTYPE >>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA >>>> --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -64,6 +64,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US >>>> --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -48,6 +48,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN >>>> --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -46,6 +46,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK >>>> --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> % Farsi yeh -> yeh >>>> <U06CC> "<U064A>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA >>>> --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -67,6 +67,7 
@@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN >>>> --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> % dong sign -> d// -> dd >>>> <U20AB> "<U0111>";"<U0064><U0064>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE >>>> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>" >>>> <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN >>>> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> % Accents are simply omitted if they cannot be represented. 
>>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA >>>> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -66,6 +66,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US >>>> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -73,6 +73,7 @@ >>>> <U05F0> "<U05D5><U05D5>";"<U0077><U0077>" >>>> <U05F1> "<U05D5><U05D9>";"<U0077><U006A>" >>>> <U05F2> "<U05D9><U05D9>";"<U006A><U006A>" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN >>>> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> class "hanzi"; / >>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA >>>> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000 >>>> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> >>>> >
3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> On 03.10.2018 11:19, Keld Simonsen wrote:
> > Hi
> >
> > Please note that translitteration of Cyrillic to latin is not universal.
> > There are different schemes for for example German, English and Danish, and
> > there is also an ISO standard for it.
>
> Thanks for your feedback, Keld!
>
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?

I think this is about me, so I must reply. I am sorry about that; the sole
reason is my lack of time. I'm just a volunteer here, which means it's not
my regular job to work on locale data, on anything in glibc, or on any other
open source project. I do these things only in my free time, of which I
don't have much. Of course you will see my contributions here and there, but
they are either trivial or take me months to complete. Your patches are on my
radar but I can't give any ETA for them. Of course, there are other people
around here and they are all welcome to come and join.

> That is:
> - In the case that there is a different preferred cyrillic
> transliteration table for any specific locale their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default cyrillic
> transliteration altogether.

As Keld wrote, there are probably separate rules for every language, so
I don't think you should treat your rules as universal and include them
in every locale. At first sight, it seems to me they work only for English
(as a destination locale). Also, although it is called "transliteration
from Cyrillic", it seems to cover only the Russian alphabet. What about
other languages which use the Cyrillic alphabet but add their own diacritic
characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
Mari, Ossetian, Yakut, Tatar, and more. What about languages which use the
Cyrillic alphabet but transliterate their respective letters in a different
way than Russian? For example, Russian "Ъ" is (I think) usually skipped
in transliteration; I think you propose "``", but when transliterating from
Bulgarian it is usually rendered as "ă".

A few remarks:

* I think you transliterate "щ" as "shh", wouldn't "shch" be better?
* You transliterate "ц" as "cz", wouldn't "ts" be better? By the way,
  in Polish "cz" is a correct transliteration of "ч".
* You transliterate "й" as "j"; this is fine in many languages, but wouldn't
  "y" be better in English?
* In the case of "е": how will you know whether it is correct to
  transliterate it as "e", "ie", "je" or "ye"?

These remarks are obviously incomplete; your patch deserves a much more
thorough review.

Best regards,

Rafal
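To make the mappings questioned above concrete, here is a minimal Python sketch (not part of the patch or of glibc). PATCH_TABLE copies the first-choice lowercase entries from the quoted translit_cyrillic file; REVIEW_OVERRIDES and the transliterate() helper are hypothetical names introduced only for illustration, and deriving uppercase output by upper-casing the lowercase entry is an assumption that happens to match the patch's uppercase entries.

```python
# First-choice lowercase mappings as listed in the quoted translit_cyrillic
# table (fallback candidates such as "z" for "zh" are omitted here).
PATCH_TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}

# Hypothetical overrides reflecting the alternatives suggested in the review
# remarks ("shch", "ts", "y"); they are not part of the submitted patch.
REVIEW_OVERRIDES = {"щ": "shch", "ц": "ts", "й": "y"}


def transliterate(text: str, table: dict) -> str:
    """Replace each Cyrillic letter found in `table`; pass everything else through."""
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch])
        elif ch.lower() in table:
            # Uppercase letter: upper-case the lowercase mapping (Ж -> "ZH").
            out.append(table[ch.lower()].upper())
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    pangram = "Съешь ещё этих мягких французских булок, да выпей же чаю."
    print(transliterate(pangram, PATCH_TABLE))
    # S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
    print(transliterate(pangram, {**PATCH_TABLE, **REVIEW_OVERRIDES}))
```

The first call reproduces the output the patch description gives for the test sentence; the second shows how the suggested alternatives would change it. A per-character table like this cannot express position-dependent choices such as the "е" question in the last remark.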
removed a png image attachment

Keld, Marko, Rafal, other locale maintainers,

all of this is written with a minimal viable fix for this bug in mind, to be
delivered as soon as possible. I want to avoid wasting maintainers' time on
fundamental discussions here (although they would be held for perfectly good
reasons).

I see three options:

1. Those locale maintainers that are fine with using the ISO
   9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
   in their locales.
   https://sourceware.org/bugzilla/attachment.cgi?id=11289
2. Those that want to have a differing table can create their own variety
   based on the spreadsheet I have prepared
   https://sourceware.org/bugzilla/attachment.cgi?id=8590
   and include it in this patch.
3. Those that want to omit a cyrillic transliteration altogether for now
   state so and just carry over the bug #2872 from the year 2006.

Does this make sense to you?

Just to be super clear on this: the patch is a stopgap _ASCII_
transliteration table. ASCII being the AMERICAN Standard Code for Information
Interchange, it is obviously orthogonal to any transliteration rule of other
countries. As such it is not explicitly targeting the transliteration
standards of any country.

The patch reflects the Russian variety of ISO 9:1995/GOST_7.79_System_B
because
a) ISO 9:1995/GOST_7.79_System_B is available and can be helpful to a
   majority of cyrillic users
b) I have access to it, including by being proficient in Russian.

It is offered to all the respective locale maintainers as a stopgap solution.
Stopgap in the sense that it is better to have some transliteration than not
to have any at all and carry over the bug from 2006. That it may be a somewhat
officially correct transliteration for ru_RU is a bonus. In that sense I would
dub the discussion on the correctness for other languages "offtopic". Let me
know if this is not OK.

You are all correctly pointing out the deficiencies of this approach.
However, I couldn't find a better straightforward approach as of yet. I am
happy to hear from you on how this could be handled.

There is a danger of being caught in the web of language/country differences.
I propose just pruning the locales that are not comfortable including this
current table. We can address possible solutions in the second wave of
patching.

I am wary of getting into discussions on specific country variants just
because of the sheer complexity of this topic. It is probably better addressed
by the respective maintainers of their locales. I do not see a "one fits all"
solution as possible in this first wave.

I would like to have this "three options plan of action" vetted first and then
we could go into the specific details. (Like, for instance, what characters
should be included in the table, and in which transliteration form.)

I am looking forward to your reply,
Egor Kobylkin

P.S. Specifically as to how to address languages other than Ru included in
GOST_7.79_System_B: we can take the first option left to right from that table
(Ru,By,Uk,Bg,Mk). Then it will technically work for all those
locales/languages but with errors where Ru supersedes their own variants.

On 05.10.2018 11:20, Rafal Luzynski wrote: > 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote: >> >> On 03.10.2018 11:19, Keld Simonsen wrote: >>> Hi >>> >>> Please note that translitteration of Cyrillic to latin is not universal. >>> There are different schemes for for example German, English and Danish, and >>> there is also an ISO standard for it. >> >> Thanks for your feedback, Keld! >> >> Could the locale maintainers that wouldn't like to include this patch >> explicitly state so here? > > I think it is about me so I must reply. I am sorry about that and the sole > reason is my lack of time. I'm just a volunteer here, that means it's not > my regular job to work on locale data nor anything in glibc nor in any other > open source project. I do these things only in my free time which I don't > have much. 
Of course you will see my contributions here and there but they > are either trivial or take me months to complete. Your patches are on my > radar but I can't tell any ETA for them. Of course, there are other people > around here and they are all welcome to come and join. > >> That is: >> - In the case that there is a different preferred cyrillic >> transliteration table for any specific locale their maintainers may want >> to point me to it so I can supply a separate table/patch. >> - Or they could state explicitly that for some reason they would like to >> exclude their locale from the patch for a default cyrillic >> transliteration altogether. > > As Keld wrote, there are probably separate rules for every language so > I don't think you should treat your rules as universal and include them > in every locale. At first sight, it seems to me they work only for English > (as a destination locale). Also, although it is called "transliteration > from Cyrillic" it seems that it covers only Russian alphabet. What about > other languages which use Cyrillic alphabet but add their own diacritic > characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash, > Mari, Ossetian, Yakut, Tatar, and more. What about languages which use > Cyrillic alphabet but transliterate their respective letters in a different > way than Russian? For example, Russian "Ъ" is (I think) usually skipped > in transliteration, I think you propose "``", but when transliterating from > Bulgarian they usually transliterate this as "ă". > > Few remarks: > > * I think you transliterate "щ" as "shh", wouldn't "shch" be better? > * You transliterate "ц" as "cz", wouldn't "ts" be better? By the way, > in Polish language "cz" is a correct transliteration of "ч". > * You transliterate "й" as "j", this is fine in many languages but wouldn't > "y" be better in English? > * In case of "е": how will you know if it is correct to transliterate it > to "e" or "ie" or "je" or "ye"? 
> > These remarks are obviously incomplete, your patch deserves much more > attention to review. > > Best regards, > > Rafal > Hi Egor, Thanks for your patience with this one. On 2018-10-03 12:32, Egor Kobylkin wrote: > On 03.10.2018 11:19, Keld Simonsen wrote: >> >> Please note that translitteration of Cyrillic to latin is not universal. >> There are different schemes for for example German, English and Danish, and >> there is also an ISO standard for it. > > Thanks for your feedback, Keld! > > Could the locale maintainers that wouldn't like to include this patch > explicitly state so here? > > That is: > - In the case that there is a different preferred cyrillic > transliteration table for any specific locale their maintainers may want > to point me to it so I can supply a separate table/patch. > - Or they could state explicitly that for some reason they would like to > exclude their locale from the patch for a default cyrillic > transliteration altogether. The Wikipedia article https://en.wikipedia.org/wiki/ISO_9 helps to understand that ISO 9:1995 and GOST 7.79-2000 System A are identical so perhaps you could mention both ISO 9 and the Wikipedia article in the commit log. translit_cyrillic includes every transliteration defined in ISO 9:1995 and GOST 7.79-2000, correct? I think those locales which already have Cyrillic transliteration defined it would be best to leave them as-is (as you've done) unless there are some issues with them, there's probably a good reason why they have been added in the first place. For other locales, using ISO 9 instead of not doing transliteration at all may not be entirely correct but I'd suppose it's better to provide at least some sort of transliteration (even if not entirely correct) than sequences of question marks. But as you say, locale maintainers may know better the case for individual locales. 

Wrt the language-specific differences Keld mentioned, the Finnish Wikipedia article on transliteration gives an example; see the table on the right at https://fi.wikipedia.org/wiki/Siirtokirjoitus for Russian / international / Finnish / Swedish / English / French / German / Polish / phonetic transliteration of a Russian name. (The table also shows that for correct transliteration ASCII letters are not enough for some languages.)

Some of the differences and language-specific aspects are probably impossible to take fully into account within the locale system we have today. For example, in Finnish (the tables at http://jkorpela.fi/iso9.html8 and https://fi.wikipedia.org/wiki/Ven%C3%A4j%C3%A4n_translitterointi might also be helpful):

1) transliteration of Russian is mostly as per ISO 9 but with national differences defined in SFS 4900
2) transliteration of Russian names and of Ukrainian names differs slightly according to http://jkorpela.fi/iso9.html8
3) transliteration of a letter depends on its position within a word or the pronunciation of adjacent letters; for example, U+0435 becomes U+0065 (e), except at the beginning of a word, where it becomes U+006A U+0065 (je)

Hopefully we'll hear comments from others as well. Once your patch is merged, I'll try to come up with the needed locale-specific changes for fi_FI. Some differences referred to in 1) above are straightforward to implement, but for 2) and 3) some compromises probably need to be made, unfortunately.

Thanks,

>> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>>> Ping.
>>>
>>> Absent feedback, I am wondering if anything could be missing in this
>>> patch from the maintainers' standpoint. More than two months have passed
>>> since the original submission.
>>> >>> If I can be of assistance, please do not hesitate to contact me, >>> Egor Kobylkin >>> >>> On 06.08.2018 21:00, Egor Kobylkin wrote: >>>> Dear locale maintainers, >>>> >>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails" >>>> >>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1] >>>> >>>> add Cyrillic transliteration table translit_cyrillic file >>>> >>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7] >>>> >>>> to localedata/locales/ and include it in all your locales going forward. >>>> >>>> Patch included inline below. >>>> >>>> This is a re-submission for the consideration for 2.29 on a request from >>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html >>>> >>>> From this patch I have excluded locales that already mention cyrillic or >>>> have a transliteration table for it: >>>> az_AZ >>>> iso14651_t1_common >>>> ky_KG >>>> mn_MN >>>> sr_RS >>>> tg_TJ >>>> tk_TM >>>> tt_RU >>>> uk_UA >>>> uz_UZ >>>> uz_UZ@cyrillic >>>> >>>> Their maintainers are requested to make an explicit decision on how and >>>> whether at all to include this patch. >>>> >>>> >>>> >>>> Current bug effect: >>>> >>>> The glibc wiki explicitly lists this use case as the test example >>>> >>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales : >>>> >>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < >>>> translit-test-input.txt >>>> >>>> currently it fails on Cyrillic texts in most locales including ru_RU [1] >>>> [8] [9]: >>>> >>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < >>>> translit-test-input.txt |grep CYRILLIC >>>> >>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???. >>>> >>>> - It produces a string of question marks and spaces. >>>> >>>> This is what it should produce and it does so after the patch applied: >>>> >>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe >>>> chayu. 
>>>>
>>>>
>>>> Root problem and the fix:
>>>>
>>>> The root problem is the missing transliteration table that I am
>>>> supplying here. Furthermore it has to be referenced/included into the
>>>> active locale at compilation time to be used by iconv.
>>>>
>>>>
>>>>
>>>> COMMIT MESSAGE:
>>>> This translit_cyrillic table enables conversion (e.g. with iconv) from a
>>>> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>>>>
>>>> While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
>>>> a transliteration has only ASCII codes but still can be read by a native
>>>> speaker. Among other things it is useful for processing the Cyrillic
>>>> texts and filenames by programs or on systems that are not specifically
>>>> prepared to work with Cyrillic, don't have corresponding fonts installed
>>>> or can't handle UTF-8.
>>>>
>>>> The transliteration table itself is attached as a file translit_cyrillic
>>>> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
>>>> (Federal Agency on Technical Regulating and Metrology Of Russian
>>>> Federation [2]). Technically an independent but identical source [3] was
>>>> used and prepared in a spreadsheet [6].
>>>>
>>>> The documentation suggests that the transliteration tables inclusion is
>>>> done by adding *include "translit_cyrillic";""* string into LC_CTYPE
>>>> translit_start section
>>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>>> Practically I have searched for all locales that have a
>>>> translit_start/end stanza and generated a patch for them.
>>>>
>>>> The Cyrillic transliteration of e.g. Russian text may have already
>>>> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
>>>> have their transliteration tables included inline.
>>>> However it would not be the standard Russian Cyrillic transliteration as
>>>> described above.
>>>> I am excluding these locales from this proposed patch.
I have written >>>> directly to locale maintainer emails listed in the files. Volodymyr >>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA), >>>> ???????????? ?????????? <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the >>>> exclusion. >>>> >>>> Links: >>>> >>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872 >>>> [2] GOST 7.79-2000 official source >>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only >>>> available in low quality gif format) >>>> [3] http://transliteration.ru/gost-7-79-2000/ and >>>> http://www.yfermer.ru/specifications/285821.html >>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet >>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9 >>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html >>>> [6] Spreadsheet for generating translit_cyrillic >>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 >>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591 >>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales >>>> [9] translit-test-input.txt >>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618 >>>> >>>> Best regards, >>>> Egor Kobylkin >>>> >>>> --- >>>> 2018-07-17 Egor Kobylkin <egor@kobylkin.com> >>>> >>>> [BZ #2872] >>>> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration >>>> table from Cyrillic to Latin. >>>> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit >>>> section. 
>>>> * locales/aa_DJ: likewise >>>> * locales/af_ZA: likewise >>>> * locales/ak_GH: likewise >>>> * locales/am_ET: likewise >>>> * locales/ar_EG: likewise >>>> * locales/be_BY: likewise >>>> * locales/bem_ZM: likewise >>>> * locales/ber_DZ: likewise >>>> * locales/ber_MA: likewise >>>> * locales/bg_BG: likewise >>>> * locales/bi_VU: likewise >>>> * locales/bn_BD: likewise >>>> * locales/bo_CN: likewise >>>> * locales/ca_ES: likewise >>>> * locales/ce_RU: likewise >>>> * locales/cs_CZ: likewise >>>> * locales/cv_RU: likewise >>>> * locales/cy_GB: likewise >>>> * locales/da_DK: likewise >>>> * locales/de_DE: likewise >>>> * locales/dv_MV: likewise >>>> * locales/dz_BT: likewise >>>> * locales/el_GR: likewise >>>> * locales/en_GB: likewise >>>> * locales/en_NG: likewise >>>> * locales/en_ZM: likewise >>>> * locales/es_CU: likewise >>>> * locales/es_ES: likewise >>>> * locales/et_EE: likewise >>>> * locales/fa_IR: likewise >>>> * locales/ff_SN: likewise >>>> * locales/fi_FI: likewise >>>> * locales/fr_FR: likewise >>>> * locales/ga_IE: likewise >>>> * locales/gd_GB: likewise >>>> * locales/gu_IN: likewise >>>> * locales/gv_GB: likewise >>>> * locales/he_IL: likewise >>>> * locales/hi_IN: likewise >>>> * locales/hif_FJ: likewise >>>> * locales/hr_HR: likewise >>>> * locales/ht_HT: likewise >>>> * locales/hu_HU: likewise >>>> * locales/hy_AM: likewise >>>> * locales/id_ID: likewise >>>> * locales/is_IS: likewise >>>> * locales/it_IT: likewise >>>> * locales/ja_JP: likewise >>>> * locales/kk_KZ: likewise >>>> * locales/km_KH: likewise >>>> * locales/kn_IN: likewise >>>> * locales/ko_KR: likewise >>>> * locales/ks_IN: likewise >>>> * locales/kw_GB: likewise >>>> * locales/lb_LU: likewise >>>> * locales/lg_UG: likewise >>>> * locales/lij_IT: likewise >>>> * locales/ln_CD: likewise >>>> * locales/lo_LA: likewise >>>> * locales/lt_LT: likewise >>>> * locales/lv_LV: likewise >>>> * locales/mg_MG: likewise >>>> * locales/mhr_RU: likewise >>>> * locales/mk_MK: likewise >>>> * 
locales/ml_IN: likewise >>>> * locales/ms_MY: likewise >>>> * locales/mt_MT: likewise >>>> * locales/nan_TW@latin: likewise >>>> * locales/nb_NO: likewise >>>> * locales/ne_NP: likewise >>>> * locales/nhn_MX: likewise >>>> * locales/niu_NU: likewise >>>> * locales/niu_NZ: likewise >>>> * locales/nl_NL: likewise >>>> * locales/nr_ZA: likewise >>>> * locales/oc_FR: likewise >>>> * locales/om_KE: likewise >>>> * locales/or_IN: likewise >>>> * locales/os_RU: likewise >>>> * locales/pa_IN: likewise >>>> * locales/pa_PK: likewise >>>> * locales/pl_PL: likewise >>>> * locales/pt_PT: likewise >>>> * locales/quz_PE: likewise >>>> * locales/ro_RO: likewise >>>> * locales/ru_RU: likewise >>>> * locales/rw_RW: likewise >>>> * locales/sa_IN: likewise >>>> * locales/sd_IN: likewise >>>> * locales/sd_IN@devanagari: likewise >>>> * locales/sd_PK: likewise >>>> * locales/se_NO: likewise >>>> * locales/sgs_LT: likewise >>>> * locales/si_LK: likewise >>>> * locales/sk_SK: likewise >>>> * locales/sl_SI: likewise >>>> * locales/sm_WS: likewise >>>> * locales/so_SO: likewise >>>> * locales/sq_AL: likewise >>>> * locales/ss_ZA: likewise >>>> * locales/st_ZA: likewise >>>> * locales/sv_SE: likewise >>>> * locales/sw_KE: likewise >>>> * locales/ta_IN: likewise >>>> * locales/te_IN: likewise >>>> * locales/th_TH: likewise >>>> * locales/ti_ET: likewise >>>> * locales/tn_ZA: likewise >>>> * locales/to_TO: likewise >>>> * locales/tpi_PG: likewise >>>> * locales/tr_TR: likewise >>>> * locales/ts_ZA: likewise >>>> * locales/unm_US: likewise >>>> * locales/ur_IN: likewise >>>> * locales/ur_PK: likewise >>>> * locales/ve_ZA: likewise >>>> * locales/vi_VN: likewise >>>> * locales/wa_BE: likewise >>>> * locales/wo_SN: likewise >>>> * locales/xh_ZA: likewise >>>> * locales/yi_US: likewise >>>> * locales/zh_CN: likewise >>>> * locales/zu_ZA: likewise >>>> >>>> >>>> diff -uNr a/localedata/locales/C b/localedata/locales/C >>>> --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000 >>>> +++ 
b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -2292,6 +2292,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ >>>> --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA >>>> --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -72,6 +72,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH >>>> --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -56,6 +56,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET >>>> --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000 >>>> @@ -1396,6 +1396,7 @@ >>>> <U137A> <U0060><U0039><U0030> >>>> <U137B> <U0060><U0031><U0030><U0030> >>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> % >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG >>>> --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000 >>>> +++ b/localedata/locales/ar_EG 
2018-07-17 17:55:48.000000000 +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY >>>> --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM >>>> --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -42,6 +42,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ >>>> --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -166,6 +166,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA >>>> --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -86,6 +86,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG >>>> --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -49,6 +49,7 @@ >>>> >>>> translit_start >>>> include 
"translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU >>>> --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD >>>> --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN >>>> --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -43,6 +43,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES >>>> --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -72,6 +72,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU >>>> --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr 
a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ >>>> --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000 >>>> +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -2311,6 +2311,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU >>>> --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -109,6 +109,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB >>>> --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK >>>> --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -167,6 +167,7 @@ >>>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE >>>> --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -78,6 +78,7 @@ >>>> % DOUBLE HIGH-REVERSED-9 QUOTATION MARK >>>> <U201F> <U00AB>;<U0022> >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/dv_MV 
b/localedata/locales/dv_MV >>>> --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -52,6 +52,7 @@ >>>> include "translit_combining";"" >>>> >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT >>>> --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR >>>> --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB >>>> --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG >>>> --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000 >>>> +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -50,6 +50,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM >>>> --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/en_ZM 
2018-07-17 17:55:48.000000000 +0000 >>>> @@ -42,6 +42,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU >>>> --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES >>>> --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -73,6 +73,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE >>>> --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -109,6 +109,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR >>>> --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -79,6 +79,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN >>>> --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -42,6 +42,7 @@ >>>> >>>> translit_start >>>> include 
"translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI >>>> --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000 >>>> +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -137,6 +137,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR >>>> --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> % In France, accents are simply omitted if they cannot be represented. >>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE >>>> --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -54,6 +54,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB >>>> --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -47,6 +47,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN >>>> --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -62,6 +62,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include 
"translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB >>>> --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL >>>> --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN >>>> --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -61,6 +61,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ >>>> --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -37,6 +37,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR >>>> --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -153,6 +153,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ht_HT 
b/localedata/locales/ht_HT >>>> --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU >>>> --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000 >>>> +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -478,6 +478,7 @@ >>>> <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>" >>>> <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM >>>> --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -77,6 +77,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID >>>> --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS >>>> --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -2161,6 +2161,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT 
>>>> --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP >>>> --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000 >>>> @@ -1682,6 +1682,7 @@ >>>> include "translit_combining";"" >>>> include "translit_cjk_variants";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ >>>> --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -158,6 +158,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH >>>> --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -873,6 +873,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN >>>> --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR >>>> --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000 >>>> +++ 
b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -6099,6 +6099,7 @@ >>>> include "translit_combining";"" >>>> include "translit_hangul";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN >>>> --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -46,6 +46,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB >>>> --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU >>>> --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -78,6 +78,7 @@ >>>> % LATIN SMALL LETTER E WITH CIRCUMFLEX >>>> <U00EA> "<U0065><U005E>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG >>>> --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT >>>> --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000 >>>> +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -47,6 
+47,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD >>>> --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA >>>> --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -51,6 +51,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT >>>> --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -77,6 +77,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV >>>> --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -2122,6 +2122,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG >>>> --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> % Accents are simply omitted if they cannot be represented. 
>>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU >>>> --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK >>>> --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -49,6 +49,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN >>>> --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> % >>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY >>>> --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT >>>> --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -47,6 +47,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END 
LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nan_TW@latin >>>> b/localedata/locales/nan_TW@latin >>>> --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -53,6 +53,7 @@ >>>> % accents are simply omitted if they cannot be represented. >>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO >>>> --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -154,6 +154,7 @@ >>>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP >>>> --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000 >>>> @@ -43,6 +43,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX >>>> --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU >>>> --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end 
>>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ >>>> --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL >>>> --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000 >>>> +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA >>>> --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -66,6 +66,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR >>>> --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -62,6 +62,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE >>>> --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -140,6 +140,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN >>>> --- 
a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -62,6 +62,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU >>>> --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN >>>> --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -60,6 +60,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK >>>> --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> % Farsi yeh -> yeh >>>> <U06CC> "<U064A>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL >>>> --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -142,6 +142,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT >>>> --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/pt_PT 2018-07-17 
17:55:51.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE >>>> --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -57,6 +57,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO >>>> --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -144,6 +144,7 @@ >>>> <U0162> "<U021A>";"<U0054>" >>>> <U0163> "<U021B>";"<U0074>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU >>>> --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -74,6 +74,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW >>>> --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN >>>> --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include 
"translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN >>>> --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -46,6 +46,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sd_IN@devanagari >>>> b/localedata/locales/sd_IN@devanagari >>>> --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 >>>> +0000 >>>> +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 >>>> +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK >>>> --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -39,6 +39,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO >>>> --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000 >>>> @@ -205,6 +205,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT >>>> --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -59,6 +59,7 @@ >>>> copy "i18n" >>>> translit_start >>>> include "translit_combining";"" >>>> +include 
"translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK >>>> --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK >>>> --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -68,6 +68,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI >>>> --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000 >>>> +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -91,6 +91,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS >>>> --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -37,6 +37,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO >>>> --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sq_AL 
b/localedata/locales/sq_AL >>>> --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -45,6 +45,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA >>>> --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -68,6 +68,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA >>>> --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -64,6 +64,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE >>>> --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -139,6 +139,7 @@ >>>> % LATIN SMALL LETTER O WITH STROKE -> "oe" >>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE >>>> --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -44,6 +44,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN >>>> --- a/localedata/locales/ta_IN 2018-07-17 
17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN >>>> --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -63,6 +63,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH >>>> --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET >>>> --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -866,6 +866,7 @@ >>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> >>>> >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> % >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA >>>> --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO >>>> --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/to_TO 2018-07-17 
17:55:52.000000000 +0000 >>>> @@ -36,6 +36,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG >>>> --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000 >>>> +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -37,6 +37,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR >>>> --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -2430,6 +2430,7 @@ >>>> >>>> % TURKISH LIRA SIGN >>>> <U20BA> "<U0054><U004C>" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/translit_cyrillic >>>> b/localedata/locales/translit_cyrillic >>>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 >>>> +0000 >>>> +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 >>>> +0000 >>>> @@ -0,0 +1,151 @@ >>>> +escape_char / >>>> +comment_char % >>>> + >>>> +% Transliterations that converts cyrillic letters to ascii symbols >>>> inspired by GOST 7.79-2000 >>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872 >>>> +% Generated from UnicodeData.txt with >>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590 >>>> +% Up to three characters are required to do a reversible transliteration. 
>>>> + >>>> +LC_CTYPE >>>> + >>>> +translit_start >>>> + >>>> + >>>> +% CYRILLIC CAPITAL LETTER IO >>>> +<U0401> "<U0059><U004F>";<U0059> >>>> +% CYRILLIC CAPITAL LETTER A >>>> +<U0410> <U0041> >>>> +% CYRILLIC CAPITAL LETTER BE >>>> +<U0411> <U0042> >>>> +% CYRILLIC CAPITAL LETTER VE >>>> +<U0412> <U0056> >>>> +% CYRILLIC CAPITAL LETTER GHE >>>> +<U0413> <U0047> >>>> +% CYRILLIC CAPITAL LETTER DE >>>> +<U0414> <U0044> >>>> +% CYRILLIC CAPITAL LETTER IE >>>> +<U0415> <U0045> >>>> +% CYRILLIC CAPITAL LETTER ZHE >>>> +<U0416> "<U005A><U0048>";<U005A> >>>> +% CYRILLIC CAPITAL LETTER ZE >>>> +<U0417> <U005A> >>>> +% CYRILLIC CAPITAL LETTER I >>>> +<U0418> <U0049> >>>> +% CYRILLIC CAPITAL LETTER SHORT I >>>> +<U0419> <U004A> >>>> +% CYRILLIC CAPITAL LETTER KA >>>> +<U041A> <U004B> >>>> +% CYRILLIC CAPITAL LETTER EL >>>> +<U041B> <U004C> >>>> +% CYRILLIC CAPITAL LETTER EM >>>> +<U041C> <U004D> >>>> +% CYRILLIC CAPITAL LETTER EN >>>> +<U041D> <U004E> >>>> +% CYRILLIC CAPITAL LETTER O >>>> +<U041E> <U004F> >>>> +% CYRILLIC CAPITAL LETTER PE >>>> +<U041F> <U0050> >>>> +% CYRILLIC CAPITAL LETTER ER >>>> +<U0420> <U0052> >>>> +% CYRILLIC CAPITAL LETTER ES >>>> +<U0421> <U0053> >>>> +% CYRILLIC CAPITAL LETTER TE >>>> +<U0422> <U0054> >>>> +% CYRILLIC CAPITAL LETTER U >>>> +<U0423> <U0055> >>>> +% CYRILLIC CAPITAL LETTER EF >>>> +<U0424> <U0046> >>>> +% CYRILLIC CAPITAL LETTER HA >>>> +<U0425> <U0058> >>>> +% CYRILLIC CAPITAL LETTER TSE >>>> +<U0426> "<U0043><U005A>";<U0043> >>>> +% CYRILLIC CAPITAL LETTER CHE >>>> +<U0427> "<U0043><U0048>";<U0043> >>>> +% CYRILLIC CAPITAL LETTER SHA >>>> +<U0428> "<U0053><U0048>";<U0053> >>>> +% CYRILLIC CAPITAL LETTER SHCHA >>>> +<U0429> "<U0053><U0048><U0048>";<U0053> >>>> +% CYRILLIC CAPITAL LETTER HARD SIGN >>>> +<U042A> "<U0060><U0060>";<U0060> >>>> +% CYRILLIC CAPITAL LETTER YERU >>>> +<U042B> "<U0059><U0027>";<U0059> >>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN >>>> +<U042C> <U0060> >>>> +% CYRILLIC CAPITAL LETTER E >>>> +<U042D> 
"<U0045><U0060>";<U0045> >>>> +% CYRILLIC CAPITAL LETTER YU >>>> +<U042E> "<U0059><U0055>";<U0059> >>>> +% CYRILLIC CAPITAL LETTER YA >>>> +<U042F> "<U0059><U0041>";<U0059> >>>> +% CYRILLIC SMALL LETTER A >>>> +<U0430> <U0061> >>>> +% CYRILLIC SMALL LETTER BE >>>> +<U0431> <U0062> >>>> +% CYRILLIC SMALL LETTER VE >>>> +<U0432> <U0076> >>>> +% CYRILLIC SMALL LETTER GHE >>>> +<U0433> <U0067> >>>> +% CYRILLIC SMALL LETTER DE >>>> +<U0434> <U0064> >>>> +% CYRILLIC SMALL LETTER IE >>>> +<U0435> <U0065> >>>> +% CYRILLIC SMALL LETTER ZHE >>>> +<U0436> "<U007A><U0068>";<U007A> >>>> +% CYRILLIC SMALL LETTER ZE >>>> +<U0437> <U007A> >>>> +% CYRILLIC SMALL LETTER I >>>> +<U0438> <U0069> >>>> +% CYRILLIC SMALL LETTER SHORT I >>>> +<U0439> <U006A> >>>> +% CYRILLIC SMALL LETTER KA >>>> +<U043A> <U006B> >>>> +% CYRILLIC SMALL LETTER EL >>>> +<U043B> <U006C> >>>> +% CYRILLIC SMALL LETTER EM >>>> +<U043C> <U006D> >>>> +% CYRILLIC SMALL LETTER EN >>>> +<U043D> <U006E> >>>> +% CYRILLIC SMALL LETTER O >>>> +<U043E> <U006F> >>>> +% CYRILLIC SMALL LETTER PE >>>> +<U043F> <U0070> >>>> +% CYRILLIC SMALL LETTER ER >>>> +<U0440> <U0072> >>>> +% CYRILLIC SMALL LETTER ES >>>> +<U0441> <U0073> >>>> +% CYRILLIC SMALL LETTER TE >>>> +<U0442> <U0074> >>>> +% CYRILLIC SMALL LETTER U >>>> +<U0443> <U0075> >>>> +% CYRILLIC SMALL LETTER EF >>>> +<U0444> <U0066> >>>> +% CYRILLIC SMALL LETTER HA >>>> +<U0445> <U0078> >>>> +% CYRILLIC SMALL LETTER TSE >>>> +<U0446> "<U0063><U007A>";<U0063> >>>> +% CYRILLIC SMALL LETTER CHE >>>> +<U0447> "<U0063><U0068>";<U0063> >>>> +% CYRILLIC SMALL LETTER SHA >>>> +<U0448> "<U0073><U0068>";<U0073> >>>> +% CYRILLIC SMALL LETTER SHCHA >>>> +<U0449> "<U0073><U0068><U0068>";<U0073> >>>> +% CYRILLIC SMALL LETTER HARD SIGN >>>> +<U044A> "<U0060><U0060>";<U0060> >>>> +% CYRILLIC SMALL LETTER YERU >>>> +<U044B> "<U0079><U0027>";<U0079> >>>> +% CYRILLIC SMALL LETTER SOFT SIGN >>>> +<U044C> <U0060> >>>> +% CYRILLIC SMALL LETTER E >>>> +<U044D> "<U0065><U0060>";<U0065> >>>> +% 
CYRILLIC SMALL LETTER YU >>>> +<U044E> "<U0079><U0075>";<U0079> >>>> +% CYRILLIC SMALL LETTER YA >>>> +<U044F> "<U0079><U0061>";<U0079> >>>> +% CYRILLIC SMALL LETTER IO >>>> +<U0451> "<U0079><U006F>";<U0079> >>>> + >>>> + >>>> +translit_end >>>> + >>>> +END LC_CTYPE >>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA >>>> --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -64,6 +64,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US >>>> --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000 >>>> @@ -48,6 +48,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN >>>> --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -46,6 +46,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK >>>> --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> % Farsi yeh -> yeh >>>> <U06CC> "<U064A>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA >>>> --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -67,6 +67,7 
@@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN >>>> --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> % dong sign -> d// -> dd >>>> <U20AB> "<U0111>";"<U0064><U0064>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE >>>> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -69,6 +69,7 @@ >>>> <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>" >>>> <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN >>>> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -55,6 +55,7 @@ >>>> % Accents are simply omitted if they cannot be represented. 
>>>> include "translit_combining";"" >>>> >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA >>>> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -66,6 +66,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US >>>> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -73,6 +73,7 @@ >>>> <U05F0> "<U05D5><U05D5>";"<U0077><U0077>" >>>> <U05F1> "<U05D5><U05D9>";"<U0077><U006A>" >>>> <U05F2> "<U05D9><U05D9>";"<U006A><U006A>" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> END LC_CTYPE >>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN >>>> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000 >>>> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -58,6 +58,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> >>>> class "hanzi"; / >>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA >>>> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000 >>>> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000 >>>> @@ -70,6 +70,7 @@ >>>> >>>> translit_start >>>> include "translit_combining";"" >>>> +include "translit_cyrillic";"" >>>> translit_end >>>> END LC_CTYPE >>>> >>>> >>>> >
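A note on how to read the entries in the quoted translit_cyrillic table: each line gives one or more alternatives separated by ";", listed in order of preference. The Python sketch below is only an illustration, assuming the alternatives are tried left to right and the first one representable in the target charset is used. The names TRANSLIT_CYRILLIC, representable and transliterate are made up for this example and are not glibc interfaces, and only a handful of entries from the table are copied.

```python
# Subset of the proposed translit_cyrillic table, copied from the quoted patch.
# Each value lists the ';'-separated alternatives in order of preference.
# (Hypothetical structure for illustration; this is not how glibc stores it.)
TRANSLIT_CYRILLIC = {
    "\u0416": ["ZH", "Z"],    # CYRILLIC CAPITAL LETTER ZHE
    "\u0429": ["SHH", "S"],   # CYRILLIC CAPITAL LETTER SHCHA
    "\u0430": ["a"],          # CYRILLIC SMALL LETTER A
    "\u0438": ["i"],          # CYRILLIC SMALL LETTER I
    "\u043a": ["k"],          # CYRILLIC SMALL LETTER KA
    "\u0443": ["u"],          # CYRILLIC SMALL LETTER U
    "\u0436": ["zh", "z"],    # CYRILLIC SMALL LETTER ZHE
    "\u0449": ["shh", "s"],   # CYRILLIC SMALL LETTER SHCHA
    "\u044a": ["``", "`"],    # CYRILLIC SMALL LETTER HARD SIGN
    "\u0451": ["yo", "y"],    # CYRILLIC SMALL LETTER IO
}


def representable(text, charset):
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


def transliterate(text, charset="ascii", missing="?"):
    """Replace characters the target charset cannot hold.

    Assumption: alternatives are tried in table order and the first one
    that fits the target charset is used; characters without any usable
    alternative become the `missing` placeholder.
    """
    out = []
    for ch in text:
        if representable(ch, charset):
            out.append(ch)          # already fine in the target charset
            continue
        for candidate in TRANSLIT_CYRILLIC.get(ch, []):
            if representable(candidate, charset):
                out.append(candidate)
                break
        else:
            out.append(missing)     # no entry, or no alternative fits
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("\u0436\u0443\u043a\u0438"))
```

In this sketch every alternative in the table is itself ASCII, so with an ASCII target the first (multi-letter) alternative is always the one chosen; the single-letter alternative would only matter for a target that cannot hold the first form.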
Hi,

Would it make sense to first use ISO 9:1995/GOST 7.79 System A if possible
and, if not, then fall back to GOST 7.79 System B?

Implementation-wise, the current translit_* files have a few examples where a
non-ASCII transliteration is tried first before an ASCII fallback. These
examples are from translit_neutral:

% NARROW NO-BREAK SPACE
<U202F> <U00A0>;<U0020>

% REVERSED TRIPLE PRIME
<U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"

Thanks,

On 2018-10-05 13:29, Egor Kobylkin wrote:
> Keld, Marko, Rafal, other locale maintainers,
>
> this is all written with a minimal viable fix for this bug asap in
> mind. I want to avoid wasting maintainers' time getting into
> fundamental discussions here (although for perfectly good reasons).
>
> I see three options:
> 1. those locale maintainers that are fine with using the ISO
> 9:1995/GOST_7.79_System_B Cyrillic transliteration table (Ru) include it
> in their locales (see attached screenshot of the table).
> 2. those that want to have a differing table can create their own
> variety based on the spreadsheet I have prepared
> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
> this patch.
> 3. those that want to omit a Cyrillic transliteration altogether for now
> state so and just carry over the bug #2872 from the year 2006.
>
> Does this make sense to you?
>
> Just to be super clear on this: the patch is a stopgap _ASCII_
> transliteration table. ASCII being the AMERICAN Standard Code for
> Information Interchange, it is obviously orthogonal to any
> transliteration rule of other countries. As such it is not explicitly
> targeting the transliteration standards of any country.
>
> The fact that the patch reflects the Russian variety of ISO
> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
> available and can be helpful to a majority of Cyrillic users b) I have
> access to it, including via being proficient in Russian.
> > It is offered to all the respective locale maintainers as a stopgap > solution. Stopgap in the sense that it is better to have some > transliteration than not to have any at all and carry over the bug from > 2006. That it may be a somewhat officially correct transliteration for > ru_RU is a bonus. In that sense I would dub the discussion on the > correctness for other languages "offtopic". Let me know if this is not OK. > > You are all are correctly mentioning the deficiencies of this approach. > However, I couldn't find a better straightforward approach as of yet. > Happy to hear from you as on how this could be handled. > > There is a danger of being caught in the web of language/country > differences. I propose just pruning the locales that are not comfortable > including this current table. We can address possible solutions in the > second wave of patching. > > I am vary of getting into discussions on specific country variants just > because of the sheer complexity of this topic. It is probably better > addressed by respective maintainers of their locales. I do not see a > "one fits all" solution in this first wave possible. > > I would like to have this "three options plan of action" vetted first > and then we could go to the specific detail. (Like, for instance, what > characters should be included in to the table, and in which > transliteration form.) > > I am looking forward to your reply, > Egor Kobylkin > > P.S. specifically as to how address languages other than Ru included in > GOST_7.79_System_B: we can take the first option left to right from that > table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those > locales/languages but with errors where Ru supersedes their own variants. > > > On 05.10.2018 11:20, Rafal Luzynski wrote: >> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote: >>> >>> On 03.10.2018 11:19, Keld Simonsen wrote: >>>> Hi >>>> >>>> Please note that translitteration of Cyrillic to latin is not universal. 
>>>> There are different schemes for for example German, English and Danish, and >>>> there is also an ISO standard for it. >>> >>> Thanks for your feedback, Keld! >>> >>> Could the locale maintainers that wouldn't like to include this patch >>> explicitly state so here? >> >> I think it is about me so I must reply. I am sorry about that and the sole >> reason is my lack of time. I'm just a volunteer here, that means it's not >> my regular job to work on locale data nor anything in glibc nor in any other >> open source project. I do these things only in my free time which I don't >> have much. Of course you will see my contributions here and there but they >> are either trivial or take me months to complete. Your patches are on my >> radar but I can't tell any ETA for them. Of course, there are other people >> around here and they are all welcome to come and join. >> >>> That is: >>> - In the case that there is a different preferred cyrillic >>> transliteration table for any specific locale their maintainers may want >>> to point me to it so I can supply a separate table/patch. >>> - Or they could state explicitly that for some reason they would like to >>> exclude their locale from the patch for a default cyrillic >>> transliteration altogether. >> >> As Keld wrote, there are probably separate rules for every language so >> I don't think you should treat your rules as universal and include them >> in every locale. At first sight, it seems to me they work only for English >> (as a destination locale). Also, although it is called "transliteration >> from Cyrillic" it seems that it covers only Russian alphabet. What about >> other languages which use Cyrillic alphabet but add their own diacritic >> characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash, >> Mari, Ossetian, Yakut, Tatar, and more. What about languages which use >> Cyrillic alphabet but transliterate their respective letters in a different >> way than Russian? 
For example, Russian "Ъ" is (I think) usually skipped >> in transliteration, I think you propose "``", but when transliterating from >> Bulgarian they usually transliterate this as "ă". >> >> Few remarks: >> >> * I think you transliterate "щ" as "shh", wouldn't "shch" be better? >> * You transliterate "ц" as "cz", wouldn't "ts" be better? By the way, >> in Polish language "cz" is a correct transliteration of "ч". >> * You transliterate "й" as "j", this is fine in many languages but wouldn't >> "y" be better in English? >> * In case of "е": how will you know if it is correct to transliterate it >> to "e" or "ie" or "je" or "ye"? >> >> These remarks are obviously incomplete, your patch deserves much more >> attention to review. >> >> Best regards, >> >> Rafal >> >
Hi Marko,

I chose System B because it is ASCII compatible. System A is not ASCII
compatible (it has diacritics in the target).

https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
"GOST 7.79 contains two transliteration tables.

System A
one Cyrillic character to one Latin character, some with diacritics
– identical to ISO 9:1995

System B
one Cyrillic character to one or many Latin characters without
diacritics
"

Hope this helps,
Egor

On 05.10.2018 13:54, Marko Myllynen wrote:
> Hi,
>
> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
> possible and if not, then fall back to GOST 7.79 System B?
>
> Implementation-wise current translit_* files have a few examples where
> a non-ASCII transliteration is tried first before an ASCII fallback.
> These examples are from translit_neutral:
>
> % NARROW NO-BREAK SPACE
> <U202F> <U00A0>;<U0020>
> % REVERSED TRIPLE PRIME
> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>
> Thanks,
>
Hi,

The scheme I proposed would also be ASCII compatible; consider this
example:

% CYRILLIC CAPITAL LETTER SHA
<U0428> "<U0160>";"<U0053><U0068>"

"printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv
-f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
\\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as
per System B.

Thanks,

On 2018-10-05 15:00, Egor Kobylkin wrote:
> Hi Marko,
>
> I have chosen System B because it is ASCII compatible. System A is
> not ASCII compatible (diacritics in target).
>
> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
> "GOST 7.79 contains two transliteration tables.
>
> System A
> one Cyrillic character to one Latin character, some with diacritics
> – identical to ISO 9:1995
>
> System B
> one Cyrillic character to one or many Latin characters without
> diacritics
> "
> Hope this helps,
> Egor
>
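To make the rule syntax in the example above concrete, here is a minimal Python sketch that decodes one translit rule line into its source character and its ordered list of alternatives. It is not part of the patch or of glibc; `decode_tokens` and `parse_rule` are hypothetical helper names, and the sketch assumes a rule is written on a single line as a source field followed by `;`-separated alternatives made of `<UXXXX>` tokens.

```python
import re

# One <UXXXX> code point token as written in glibc locale sources.
UCS_TOKEN = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode_tokens(field: str) -> str:
    """Turn a run of <UXXXX> tokens (optionally double-quoted) into text."""
    return "".join(chr(int(hexval, 16)) for hexval in UCS_TOKEN.findall(field))


def parse_rule(line: str) -> tuple[str, list[str]]:
    """Split 'SOURCE ALT1;ALT2;...' into (source, [alternatives in order])."""
    source, alternatives = line.strip().split(None, 1)
    return decode_tokens(source), [decode_tokens(a) for a in alternatives.split(";")]


if __name__ == "__main__":
    # The CYRILLIC CAPITAL LETTER SHA rule quoted in the message above.
    print(parse_rule('<U0428> "<U0160>";"<U0053><U0068>"'))  # ('Ш', ['Š', 'Sh'])
```

The order of the alternatives is what matters here: the System A form comes first and the ASCII-only System B form second.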
After some kind help from Marko in the offline discussion I realized
that the multi/single character approach I originally took was against
the iconv(1) logic anyway. So there is no harm in dropping it and
adopting Marko's suggestion instead. I will do so and will resubmit the
patch with ISO 9:1995/GOST 7.79 System A + fallback to GOST 7.79 System
B (for ASCII).

However, this doesn't resolve the issue of the ASCII part being
different for various locales. Again, I invite the locale maintainers
to let me know if they want to 1) adopt the one I am supplying, 2)
write their own or 3) ignore the patch altogether. Your feedback is
appreciated!

This is the relevant part that helped:
> The first part (ISO-8859-15 or ASCII) defines the target encoding for
> iconv(1). //TRANSLIT is described in the iconv(1) man page as:
>
>   If the string //TRANSLIT is appended to to-encoding, characters
>   being converted are transliterated when needed and possible. This
>   means that when a character cannot be represented in the target
>   character set, it can be approximated through one or several
>   similar looking characters. Characters that are outside of the
>   target character set and cannot be transliterated are replaced
>   with a question mark (?) in the output.
>
> So in the above examples, iconv(1) encounters the character U+0428
> which is not part of either of the target encodings and, since
> //TRANSLIT is specified, iconv(1) tries transliteration according to
> the rules defined above; in the case of ASCII, U+0160 is not part of
> the target encoding so the next alternative is used.

Bests,
Egor Kobylkin

On 05.10.2018 14:21, Marko Myllynen wrote:
> Hi,
>
> The scheme I proposed would also be ASCII compatible; consider this
> example:
>
> % CYRILLIC CAPITAL LETTER SHA
> <U0428> "<U0160>";"<U0053><U0068>"
>
> "printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv
> -f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
> \\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as
> per System B.
>
> Thanks,
>
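The fallback order described in this message (try each alternative in turn, use the first one the target encoding can represent, otherwise emit a question mark) can be modelled in a few lines of Python. This is only an illustrative model of the documented behaviour, not glibc's implementation; `first_encodable`, `transliterate` and the `RULES` table are hypothetical names, and the table holds just the single CYRILLIC CAPITAL LETTER SHA rule from the thread.

```python
def first_encodable(alternatives, target_encoding):
    """Return the first alternative the target encoding can represent, else '?'."""
    for candidate in alternatives:
        try:
            candidate.encode(target_encoding)
        except UnicodeEncodeError:
            continue  # not representable, try the next alternative
        return candidate
    return "?"


def transliterate(text, rules, target_encoding):
    """Keep encodable characters; replace the rest using ordered rules."""
    out = []
    for ch in text:
        try:
            ch.encode(target_encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(first_encodable(rules.get(ch, []), target_encoding))
    return "".join(out)


# Hypothetical one-entry table: System A form first, System B form second.
RULES = {"\u0428": ["\u0160", "Sh"]}

if __name__ == "__main__":
    print(transliterate("\u0428", RULES, "iso8859-15"))  # Š
    print(transliterate("\u0428", RULES, "ascii"))       # Sh
    print(transliterate("\u0416", RULES, "ascii"))       # ? (no rule for this letter)
```

Under this model, a single table serves both Latin-9 and ASCII targets, which is the point of listing the System A form before the System B form.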
Hi,

Thanks for the update. I have a few mostly cosmetic comments below;
hopefully we'll hear from others whether they agree with this direction.

- Please add the standard glibc locale header (see the existing
  translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks
  are supported (perhaps also describe why some of those are not
  included, see e.g.
  https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespace and spaces after ;
- No duplicates:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>

should become:

% CYRILLIC SMALL LETTER IE
<U0435> <U0065>

- There are a few issues with the definitions:

% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"
% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"

I wonder whether it would be possible to automate the generation of
this file so that issues like the above could be avoided? But perhaps
that could be the next step once this initial patch lands.

Thanks,

On 2018-10-05 23:47, Egor Kobylkin wrote:
> After some kind help from Marko in the offline discussion
> I realized the multi/single character approach I originally took was
> against the iconv(1) logic anyway. So there is no harm in
> dropping it and adopting Marko's suggestion instead. I will do so and
> will resubmit the patch with ISO 9:1995/GOST 7.79 System A + fallback to
> GOST 7.79 System B (for ASCII).
>
> However this doesn't resolve the issue for ASCII part being different
> for various locales. Again, I am offering the locale maintainers to let
> me know if they want to 1) adopt the one I am supplying, 2) write their
> own or 3) ignore the patch altogether. Your feedback is appreciated!
>
> This is the relevant part that helped:
>> The first part (ISO-8859-15 or ASCII) defines the target encoding for
>> iconv(1).
//TRANSLIT is described in the iconv(1) man page as: >> >> If the string //TRANSLIT is appended to to-encoding, characters >> being converted are transliterated when needed and possible. This >> means that when a character cannot be represented in the target >> character set, it can be approximated through one or sev‐ eral >> similar looking characters. Characters that are outside of the >> target character set and cannot be transliterated are replaced >> with a question mark (?) in the output. >> >> So in the above examples, iconv(1) encounters the character U+0428 >> which is not part of either of the target encoding and since >> //TRANSLIT is specified, iconv(1) tries transliteration according to >> the rules defined above, in case of ASCII U+0160 is not part of the >> target encoding so the next alternative is used. > > Bests, > Egor Kobylkin > > On 05.10.2018 14:21, Marko Myllynen wrote: >> Hi, >> >> The scheme I proposed would also be ASCII compatible; consider this >> example: >> >> % CYRILLIC CAPITAL LETTER SHA <U0428> "<U0160>";"<U0053><U0068>" >> >> "printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv >> -f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf >> \\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as >> per System B. >> >> Thanks, >> >> On 2018-10-05 15:00, Egor Kobylkin wrote: >>> Hi Marko, >>> >>> I have chosen the System B because it is ASCII compartible. System >>> A is not ASCII compartible (diacritics in target). >>> >>> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A >>> >>> >>> > "GOST 7.79 contains two transliteration tables. 
>>> >>> System A one Cyrillic character to one Latin character, some with >>> diacritics – identical to ISO 9:1995 >>> >>> System B one Cyrillic character to one or many Latin characters >>> without diacritics " Hope this helps, Egor >>> >>> On 05.10.2018 13:54, Marko Myllynen wrote: >>>> Hi, >>>> >>>> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if >>>> possible and if not, then fall back to GOST 7.79 System B? >>>> >>>> Implementation-wise current translit_* files have few examples >>>> where a non-ASCII transliteration is tried first before an ASCII >>>> fallback. These examples are from translit_neutral: >>>> >>>> % NARROW NO-BREAK SPACE <U202F> <U00A0>;<U0020> % REVERSED >>>> TRIPLE PRIME <U2037> >>>> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>" >>>> >>>> Thanks, >>>> >>>> On 2018-10-05 13:29, Egor Kobylkin wrote: >>>>> Keld,Marko,Rafal, other locale maintainers, >>>>> >>>>> this all is written with having in mind a minimal viable fix >>>>> for this bug asap. I want to avoid wasting maintainers time >>>>> getting into fundamental discussions here (although for >>>>> perfectly good reasons). >>>>> >>>>> I see three options: 1. those locale maintainers that are fine >>>>> with using ISO 9:1995/GOST_7.79_System_B cyrillic >>>>> transliteration table (Ru) include it in their locales (see >>>>> attached screenshot of the table). 2. those that that want to >>>>> have a differing table can create their own variety based on >>>>> the spreadsheet I have prepared >>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and >>>>> include it in this patch. 3. those that want to omit a >>>>> cyrillic transliteration altogether for now state so and just >>>>> carry over the bug #2872 from the year 2006. >>>>> >>>>> Does this make sense to you? >>>>> >>>>> Just to be super clear on this: the patch is a stopgap _ASCII_ >>>>> transliteration table. 
ASCII being AMERICAN Standard Code for >>>>> Information Interchange, that is obviously orthogonal to any >>>>> transliteration rule of other countries. As such it is not >>>>> explicitly targeting transliteration standards of any country. >>>>> >>>>> The fact that the patch is reflecting Russian variety of ISO >>>>> 9:1995/GOST_7.79_System_B is because a) ISO >>>>> 9:1995/GOST_7.79_System_B is available and can be helpful to a >>>>> majority of cyrillic users b) I have access to it including >>>>> via being proficient in Russian. >>>>> >>>>> It is offered to all the respective locale maintainers as a >>>>> stopgap solution. Stopgap in the sense that it is better to >>>>> have some transliteration than not to have any at all and >>>>> carry over the bug from 2006. That it may be a somewhat >>>>> officially correct transliteration for ru_RU is a bonus. In >>>>> that sense I would dub the discussion on the correctness for >>>>> other languages "offtopic". Let me know if this is not OK. >>>>> >>>>> You are all are correctly mentioning the deficiencies of this >>>>> approach. However, I couldn't find a better straightforward >>>>> approach as of yet. Happy to hear from you as on how this >>>>> could be handled. >>>>> >>>>> There is a danger of being caught in the web of >>>>> language/country differences. I propose just pruning the >>>>> locales that are not comfortable including this current table. >>>>> We can address possible solutions in the second wave of >>>>> patching. >>>>> >>>>> I am vary of getting into discussions on specific country >>>>> variants just because of the sheer complexity of this topic. >>>>> It is probably better addressed by respective maintainers of >>>>> their locales. I do not see a "one fits all" solution in this >>>>> first wave possible. >>>>> >>>>> I would like to have this "three options plan of action" >>>>> vetted first and then we could go to the specific detail. 
>>>>> (Like, for instance, what characters should be included in to >>>>> the table, and in which transliteration form.) >>>>> >>>>> I am looking forward to your reply, Egor Kobylkin >>>>> >>>>> P.S. specifically as to how address languages other than Ru >>>>> included in GOST_7.79_System_B: we can take the first option >>>>> left to right from that table (Ru,By,Uk,Bg,Mk). Then it will >>>>> technically work for all those locales/languages but with >>>>> errors where Ru supersedes their own variants. >>>>> >>>>> >>>>> On 05.10.2018 11:20, Rafal Luzynski wrote: >>>>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote: >>>>>>> >>>>>>> On 03.10.2018 11:19, Keld Simonsen wrote: >>>>>>>> Hi >>>>>>>> >>>>>>>> Please note that translitteration of Cyrillic to latin >>>>>>>> is not universal. There are different schemes for for >>>>>>>> example German, English and Danish, and there is also an >>>>>>>> ISO standard for it. >>>>>>> >>>>>>> Thanks for your feedback, Keld! >>>>>>> >>>>>>> Could the locale maintainers that wouldn't like to include >>>>>>> this patch explicitly state so here? >>>>>> >>>>>> I think it is about me so I must reply. I am sorry about >>>>>> that and the sole reason is my lack of time. I'm just a >>>>>> volunteer here, that means it's not my regular job to work >>>>>> on locale data nor anything in glibc nor in any other open >>>>>> source project. I do these things only in my free time >>>>>> which I don't have much. Of course you will see my >>>>>> contributions here and there but they are either trivial or >>>>>> take me months to complete. Your patches are on my radar but >>>>>> I can't tell any ETA for them. Of course, there are other >>>>>> people around here and they are all welcome to come and >>>>>> join. 
>>>>>> >>>>>>> That is: - In the case that there is a different preferred >>>>>>> cyrillic transliteration table for any specific locale >>>>>>> their maintainers may want to point me to it so I can >>>>>>> supply a separate table/patch. - Or they could state >>>>>>> explicitly that for some reason they would like to exclude >>>>>>> their locale from the patch for a default cyrillic >>>>>>> transliteration altogether. >>>>>> >>>>>> As Keld wrote, there are probably separate rules for every >>>>>> language so I don't think you should treat your rules as >>>>>> universal and include them in every locale. At first sight, >>>>>> it seems to me they work only for English (as a destination >>>>>> locale). Also, although it is called "transliteration from >>>>>> Cyrillic" it seems that it covers only Russian alphabet. What >>>>>> about other languages which use Cyrillic alphabet but add >>>>>> their own diacritic characters? Think about Belarusian, >>>>>> Ukrainian, Serbian, Chechen, Chuvash, Mari, Ossetian, Yakut, >>>>>> Tatar, and more. What about languages which use Cyrillic >>>>>> alphabet but transliterate their respective letters in a >>>>>> different way than Russian? For example, Russian "Ъ" is (I >>>>>> think) usually skipped in transliteration, I think you >>>>>> propose "``", but when transliterating from Bulgarian they >>>>>> usually transliterate this as "ă". >>>>>> >>>>>> Few remarks: >>>>>> >>>>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be >>>>>> better? * You transliterate "ц" as "cz", wouldn't "ts" be >>>>>> better? By the way, in Polish language "cz" is a correct >>>>>> transliteration of "ч". * You transliterate "й" as "j", this >>>>>> is fine in many languages but wouldn't "y" be better in >>>>>> English? * In case of "е": how will you know if it is >>>>>> correct to transliterate it to "e" or "ie" or "je" or "ye"? >>>>>> >>>>>> These remarks are obviously incomplete, your patch deserves >>>>>> much more attention to review. 
>>>>>> >>>>>> Best regards, >>>>>> >>>>>> Rafal >>>>>> >>>>> >>>> >>>> >>> >> >> >
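The System A then System B fallback that Marko describes in the quoted discussion can be mimicked in a few lines of Python. This is only a sketch of the "first alternative that fits the target encoding wins" rule, not glibc's actual //TRANSLIT implementation; the `RULES` table and the `transliterate` helper are hypothetical names introduced for illustration, and the single entry is the CYRILLIC CAPITAL LETTER SHA example from the thread.

```python
# Illustrative only: mimics the "first alternative that fits" rule.
# It is NOT how glibc implements //TRANSLIT internally.

# Hypothetical in-memory form of one translit_cyrillic entry:
#   <U0428> "<U0160>";"<U0053><U0068>"
RULES = {
    "\u0428": ["\u0160", "Sh"],  # CYRILLIC CAPITAL LETTER SHA
}


def fits(text, encoding):
    """Return True if text can be represented in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, target_encoding, rules=RULES):
    """Replace unrepresentable characters by the first alternative that fits."""
    out = []
    for ch in text:
        if fits(ch, target_encoding):
            out.append(ch)
            continue
        for alternative in rules.get(ch, []):
            if fits(alternative, target_encoding):
                out.append(alternative)
                break
        else:
            # No rule, or no alternative fits: iconv(1) prints "?"
            out.append("?")
    return "".join(out)


print(transliterate("\u0428", "iso8859_15"))  # System A form
print(transliterate("\u0428", "ascii"))       # System B form
```

With ISO-8859-15 as the target the first alternative (U+0160) is representable and is used; with ASCII it is not, so the second alternative is taken, which matches the behaviour described for the two iconv invocations.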
5.10.2018 12:36 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> I see three options:
> 1. those locale maintainers that are fine with using ISO
> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
> in their locales. https://sourceware.org/bugzilla/attachment.cgi?id=11289
> 2. those that that want to have a differing table can create their own
> variety based on the spreadsheet I have prepared
> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
> this patch.
> 3. those that want to omit a cyrillic transliteration altogether for now
> state so and just carry over the bug #2872 from the year 2006.
>
> Does this make sense to you?

The problem is that we don't have a separate maintainer for each locale; we have only 2 maintainers for about 200 locales and we must represent them all. Sometimes a locale may happen to be our own native locale or that of someone on this list, or it may be a locale whose language we happen to speak as a foreign language, or we may have friends who speak it. Or it may be totally unknown and we still must somehow handle it.

I think that these transliteration rules should be included in multiple locales on an "opt-in" basis rather than "opt-out". I mean, we should not include them in all locales unless someone explicitly provides different rules. Instead, I think we should add them (maybe with modifications) only to those locales where we have a good reason to think they will work.

In particular, I think that those rules will not be helpful at all for languages which use neither the Latin nor the Cyrillic alphabet.

> [...]
> The fact that the patch is reflecting Russian variety of ISO
> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
> available and can be helpful to a majority of cyrillic users b) I have
> access to it including via being proficient in Russian.
I took a look at these standards. At first I doubted they could be correct for the English language; now I understand they were created for Russian users. Therefore I think it is correct to include them in the Russian locale data. Will it be OK if we say that it is only for the Russian language? Will that be satisfactory for you and/or your users?

> It is offered to all the respective locale maintainers as a stopgap
> solution. Stopgap in the sense that it is better to have some
> transliteration than not to have any at all and carry over the bug from
> 2006. That it may be a somewhat officially correct transliteration for
> ru_RU is a bonus. In that sense I would dub the discussion on the
> correctness for other languages "offtopic". Let me know if this is not OK.

If you are referring to languages other than Russian which also use the Cyrillic alphabet but need different transliteration rules than Russian for the same characters, then it is OK for me for now. I am afraid that the iconv algorithm does not handle such a case. Of course, we should add this missing feature eventually, but I do not volunteer to do it now.

> [...]
> P.S. specifically as to how address languages other than Ru included in
> GOST_7.79_System_B: we can take the first option left to right from that
> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
> locales/languages but with errors where Ru supersedes their own variants.

Makes sense, as long as we cannot select the source language now.

But, while at this, is there anything that stops us from adding transliteration rules for additional Cyrillic characters not used in Russian but used in other languages?

Regards,

Rafal
8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
> Hi,
>
> Thanks for the update. I have few mostly cosmetic comments below,
> hopefully we'll hear from others whether they agree with this direction.
>
> - Please add the standard glibc locale header (see the existing
> translit_* files for reference)
> - Consider wrapping the header lines at or around column 70-72
> - Consider describing which characters, character ranges, or blocks are
> supported (perhaps also describe why some of those are not included, see
> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
> - Please remove trailing whitespaces and spaces after ;

Thanks for this, Marko. While at this, in the ChangeLog and in the commit message, entries like this one:

* locales/aa_DJ: likewise

1. Should use a path relative to the root directory of the glibc source, that is: "* localedata/locales/aa_DJ".
2. Should say "Likewise." (starting with an uppercase letter and ending with a dot).

> - No duplicates:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>; <U0065>
>
> should become:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>
>
> - There are few issues with the definitions:
>
> % CYRILLIC CAPITAL LETTER U
> <U0423> <U0055>; <U0055>
> % CYRILLIC UNDEFINED
> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>
> % CYRILLIC SMALL LETTER U
> <U0443> <U0075>; <U0075>
> % CYRILLIC UNDEFINED
> <U0443><U0443> <U00FA>; "<U0075><U0060>"

Are the duplicates here because some Cyrillic letters may have multiple Latin transliterations depending on the context? For example, Cyrillic IE must be transliterated sometimes as "e", sometimes as "ie", sometimes as "ye" or "je". Can we provide rules for groups of characters instead?

> I wonder would it be possible to automate generation of this file so
> that issues like the above could avoided? But perhaps that could be the
> next step once this initial patch lands.

I agree with this.

Regards,

Rafal
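Two of the cosmetic requests quoted above (no repeated alternatives, no trailing whitespace or spaces after ";") are mechanical enough to check with a script. The following is a minimal sketch, assuming rule lines start with a `<U....>` source followed by whitespace and a ";"-separated list of alternatives, as in the examples in this thread; `normalize_rule` is a hypothetical helper, not part of glibc or of the submitted patch.

```python
def normalize_rule(line):
    """Tidy one line of a translit_* file (illustrative helper only).

    - strips trailing whitespace
    - removes spaces around ';' between alternatives
    - drops alternatives that repeat an earlier one
    Lines that are not rules (comments, keywords) only lose trailing
    whitespace.
    """
    line = line.rstrip()
    if not line.startswith("<U"):
        return line
    parts = line.split(None, 1)
    if len(parts) < 2:
        return line
    source, rest = parts
    seen = set()
    alternatives = []
    # Assumption: ';' never occurs inside a quoted alternative, which
    # holds for the <Uxxxx> notation used in this thread.
    for alternative in (a.strip() for a in rest.split(";")):
        if alternative and alternative not in seen:
            seen.add(alternative)
            alternatives.append(alternative)
    return source + " " + ";".join(alternatives)


print(normalize_rule("<U0435> <U0065>; <U0065>  "))
print(normalize_rule('<U0428> "<U0160>"; "<U0053><U0068>"'))
```

The helper deliberately does not judge whether an entry is meaningful (for example the multi-character "CYRILLIC UNDEFINED" sources flagged above); it only normalizes the layout.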
Hi Rafal,

> But, while at this, is there anything that stops are from adding
> transliteration rules for additional Cyrillic characters not used in
> Russian but used in other languages?

Just to make sure we are not talking at cross purposes: since your last email on this topic I have, on Marko's suggestion, already implemented ISO 9 transliteration for all the characters there are. This should cover most if not all Slavic Cyrillic. You seem to have just noticed and replied to that email of Marko's as I write mine. Please also check the spreadsheet version I have just uploaded: https://sourceware.org/bugzilla/attachment.cgi?id=11298

I am currently absorbing Marko's further suggestions and corrections to that one and will get back for more discussion once done there.

I am reading your suggestions and taking them to heart, be sure of that.

Two professional translators independently pointed out the difference between transliteration and transcription to me. Transliteration is normative (letter for letter), while transcription is phonetic: each letter becomes whatever combination of Latin letters in the target language sounds like it to a native speaker. While transliteration should be easy to cover for all those languages via ISO 9, transcription is inherently language specific.

The problem is that we are (mis)using the transcription as transliteration to ASCII, because the ASCII set of characters does not allow for proper transcription.

Another problem is that, to be really useful, the ASCII transliteration should work outside of the source locale (i.e. not only in ru_RU but also in en_US, de_DE, en_DE, es_ES etc., or even just the C locale). In fact, for myself, I would be committed to doing all the work needed to cover at least C, en_US, ru_RU, de_DE, in that order. ru_RU is a "courtesy": I am not really using it, but I hope more locale contributors may come because of that and fix my bugs :-).
> The problem is that we don't have a separate maintainer for each
> locale, we have only 2 maintainers for about 200 locales and we must
> represent them all.

It was not clear to me that the glibc team cannot fall back on individual locale maintainers to make the decision. But then it may make the decision making even easier. If you guys have a list of requirements (maybe implicit until now), could you please shoot them my way? We can also certainly just keep this thread up and have all issues ironed out.

Anyway, hopefully with ISO 9 as the first column in translit_cyrillic we now cover the issue of the completeness of transliteration. What we need to figure out is transcription/transliteration to ASCII, the second column. Are we sharing the same view on this?

Speaking of decision making: maybe I can get an officially certified court translator to answer our questions. Would you care to put together a list of questions you would like answered in order to decide on the table and its inclusion into various locales?

Hope this helps,
Egor

On 09.10.2018 00:04, Rafal Luzynski wrote: > 5.10.2018 12:36 Egor Kobylkin <egor@kobylkin.com> wrote: >> [...] I see three options: 1. those locale maintainers that are >> fine with using ISO 9:1995/GOST_7.79_System_B cyrillic >> transliteration table (Ru) include it in their locales. >> https://sourceware.org/bugzilla/attachment.cgi?id=11289 2. those >> that that want to have a differing table can create their own >> variety based on the spreadsheet I have prepared >> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include >> it in this patch. 3. those that want to omit a cyrillic >> transliteration altogether for now state so and just carry over the >> bug #2872 from the year 2006. >> >> Does this make sense to you? > > The problem is that we don't have a separate maintainer for each > locale, we have only 2 maintainers for about 200 locales and we must > represent them all. 
Sometimes a locale may happen to be our own > native locale or of someone in this list, or it may be a locale which > we accidentally can speak as a foreign language, or we may have > friends who can speak it. Or it may be totally unknown and we still > must somehow handle it. > > I think that these transliteration rules should be included in > multiple locales on "opt-in" basis rather than "opt-out". I mean, we > should not include them in all locales unless someone explicitly > provides a different rules. Instead, I think we should add them > (maybe with modification) only to those locales where we have a good > reason to think they will work. > > Particularly, I think that those rules will not be helpful at all > for the languages which use neither Latin nor Cyrillic alphabet. > >> [...] The fact that the patch is reflecting Russian variety of ISO >> 9:1995/GOST_7.79_System_B is because a) ISO >> 9:1995/GOST_7.79_System_B is available and can be helpful to a >> majority of cyrillic users b) I have access to it including via >> being proficient in Russian. > > I took a look at these standards and as first I doubted they may be > correct for English language now I understand they are created for > Russian users. Therefore I think it is pretty correct to include > them to Russian locale data. Will it be OK if we say that it is only > for Russian language? Will it be satisfying for you and/or your > users? > >> It is offered to all the respective locale maintainers as a >> stopgap solution. Stopgap in the sense that it is better to have >> some transliteration than not to have any at all and carry over the >> bug from 2006. That it may be a somewhat officially correct >> transliteration for ru_RU is a bonus. In that sense I would dub the >> discussion on the correctness for other languages "offtopic". Let >> me know if this is not OK. 
> > If you refer to other languages than Russian which also use the > Cyrillic alphabet but need a different transliteration rules than > Russian for the same characters then it is OK for me now. I am > afraid that the iconv algorithm does not handle such case. Of > course, we should add this missing feature eventually but I do not > volunteer to do it now. > >> [...] P.S. specifically as to how address languages other than Ru >> included in GOST_7.79_System_B: we can take the first option left >> to right from that table (Ru,By,Uk,Bg,Mk). Then it will technically >> work for all those locales/languages but with errors where Ru >> supersedes their own variants. > > Makes sense, as long as we cannot select the source language now. > > But, while at this, is there anything that stops are from adding > transliteration rules for additional Cyrillic characters not used in > Russian but used in other languages? > > Regards, > > Rafal >
On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski <digitalfreak@lingonborough.com> wrote: > The problem is that we don't have a separate maintainer for each locale, > we have only 2 maintainers for about 200 locales and we must represent > them all. Sometimes a locale may happen to be our own native locale or > of someone in this list, or it may be a locale which we accidentally can > speak as a foreign language, or we may have friends who can speak it. > Or it may be totally unknown and we still must somehow handle it. I just want to mention that this is also why most of the non-locale maintainers tend to stay out of threads about locales. We know we're even less expert on these issues than you are, and I think as a general rule you should be assuming that the community is OK with what you're doing unless someone speaks up to object. zw
On 09.10.2018 00:23, Rafal Luzynski wrote:
> 8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
>> Hi,
>>
>> Thanks for the update. I have few mostly cosmetic comments below,
>> hopefully we'll hear from others whether they agree with this direction.
>>

Yeah, the earlier we have feedback the more productive we are. I'd be happy to get as much feedback on this as early as possible. So everybody concerned, please chime in.

>
>> - No duplicates:
>>
>> % CYRILLIC SMALL LETTER IE
>> <U0435> <U0065>; <U0065>
>>
>> should become:
>>
>> % CYRILLIC SMALL LETTER IE
>> <U0435> <U0065>
>>
>> - There are few issues with the definitions:
>>
>> % CYRILLIC CAPITAL LETTER U
>> <U0423> <U0055>; <U0055>
>> % CYRILLIC UNDEFINED
>> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>>
>> % CYRILLIC SMALL LETTER U
>> <U0443> <U0075>; <U0075>
>> % CYRILLIC UNDEFINED
>> <U0443><U0443> <U00FA>; "<U0075><U0060>"
>
> Are the duplicates here because some Cyrillic letters may have multiple
> Latin transliterations depending on the context, for example Cyrillic IE
> must be transliterated sometimes as "e", sometimes as "ie", sometimes
> as "ye" or "je"? Can we provide rules for groups of characters instead?

No, the duplicates are simply a result of the design of my line-generating logic. I have fixed (removed) them. The varying transcription between languages/locales cannot be handled in one file at all, as far as I understand.

>
>> I wonder would it be possible to automate generation of this file so
>> that issues like the above could avoided? But perhaps that could be the
>> next step once this initial patch lands.

I am generating the content part of translit_cyrillic from the LibreOffice spreadsheet. Not sure if you have had time to view it by now? https://sourceware.org/bugzilla/attachment.cgi?id=11299

Anyway, I have just fixed the issues identified by Marko above in that spreadsheet. I will make the changes for the request below and then upload the new translit_cyrillic file to the bugzilla.
>> - Please add the standard glibc locale header (see the existing >> translit_* files for reference) >> - Consider wrapping the header lines at or around column 70-72 >> - Consider describing which characters, character ranges, or blocks are >> supported (perhaps also describe why some of those are not included, see >> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode) >> - Please remove trailing whitespaces and spaces after ; > > Thanks for this, Marko. While at this, in the ChangeLog and in the commit > message these paths: > > * locales/aa_DJ: likewise > > 1. Should be a relative path starting in the root directory of glibc source, > that is: "* localedata/locales/aa_DJ". > 2. Should be "Likewise." (starting with an uppercase and ending with a dot). will do. Bests, Egor
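The generation step discussed in this message (one rule line per spreadsheet row, with repeated alternatives dropped) could be sketched roughly as follows. This is only an illustration in Python, not the spreadsheet logic actually used; the helper names and the sample rows are hypothetical.

```python
def ucode(text):
    """Render a string in the <UXXXX> notation used in locale sources."""
    return "".join(f"<U{ord(ch):04X}>" for ch in text)


def target(text):
    """Multi-character replacements are written as a quoted string."""
    code = ucode(text)
    return f'"{code}"' if len(text) > 1 else code


def translit_line(source, alternatives):
    """Build one rule line; repeated alternatives are dropped, order kept."""
    unique = list(dict.fromkeys(alternatives))
    return f"{ucode(source)} " + ";".join(target(alt) for alt in unique)


# Hypothetical rows, as they might come out of a spreadsheet export.
rows = [
    ("\u0435", ["e", "e"]),        # IE: both systems agree, so one alternative
    ("\u0436", ["\u017e", "zh"]),  # ZHE: Latin with caron, then ASCII fallback
]
lines = [translit_line(src, alts) for src, alts in rows]
print("\n".join(lines))
```

Dropping duplicates with `dict.fromkeys` keeps the first occurrence, so the preferred alternative stays first and no space is emitted after the semicolon.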
Hi,

I have now implemented all the changes requested for the
translit_cyrillic file but started hitting what seems like a bug:

- If the line <U0425> <U0048>;<U0058> is present in translit_cyrillic,
  the locale compilation fails, i.e. the command
  grep CYRILLIC < $testfile |
  LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
  iconv -f UTF-8 -t ASCII//TRANSLIT
  hangs.

- If the line <U0425> <U0048>;<U0058> is absent from translit_cyrillic,
  everything works; only the transliteration of <U0425> fails, as
  expected (? is displayed).

- If translit_cyrillic contains <U0425> <U0048>;<U0058> as the _only_
  line, the transliteration of <U0425> works again (others show as ?).

Would you have any idea in which direction I should look? The new
translit_cyrillic is attached.

(<U0425> is % CYRILLIC CAPITAL LETTER HA)

Best regards,
Egor

On 09.10.2018 01:35, Egor Kobylkin wrote:
> [...]

escape_char /
comment_char %

% This file is part of the GNU C Library and contains locale data.
% The Free Software Foundation does not claim any copyright interest
% in the locale data contained in this file. The foregoing does not
% affect the license of the GNU C Library as a whole. It does not
% exempt you from the conditions of the license if your use would
% otherwise be governed by that license.

% Transliterations of cyrillic letters to latin and/or ascii symbols.
% Inspired by ISO 9.1995 / GOST 7.79-2000.
% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
% i.e [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995
% It implements the GOST_7.79 System A (Latin Script) as a first
% option and System B Cyrillic (ASCII) as a second option. Check
% https://en.wikipedia.org/wiki/ISO_9 for reference.
% The System B is extended from GOST_7.79-Russian using open sources
% of the transliteration mappings and the "h/`" diacritics logic.
% Usage examples:
% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
% | iconv -f ISO-8859-15 -t UTF-8 # System A
% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
% Contributions welcome for the rest of Cyrillic script in Unicode
% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
% Generated from UnicodeData.txt with
% https://sourceware.org/bugzilla/attachment.cgi?id=11300.

LC_CTYPE translit_start % CYRILLIC CAPITAL LETTER IO <U0401> <U00CB>;"<U0059><U004F>" % CYRILLIC CAPITAL LETTER DJE <U0402> <U0110>;"<U0044><U004A>" % CYRILLIC CAPITAL LETTER GJE <U0403> <U01F4>;"<U0047><U0060>" % CYRILLIC CAPITAL LETTER UKRAINIAN IE <U0404> <U00CA>;"<U0059><U0065>" % CYRILLIC CAPITAL LETTER DZE <U0405> <U1E90>;"<U005A><U0060>" % CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I <U0406> <U00CC>;<U0049> % CYRILLIC CAPITAL LETTER YI <U0407> <U00CF>;"<U0059><U0069>" % CYRILLIC CAPITAL LETTER JE <U0408> "<U004A><U030C>";<U004A> % CYRILLIC CAPITAL LETTER LJE <U0409> "<U004C><U0302>";"<U004C><U0060>" % CYRILLIC CAPITAL LETTER NJE <U040A> "<U004E><U0302>";"<U004E><U0060>" % CYRILLIC CAPITAL LETTER TSHE <U040B> <U0106>;"<U0054><U0053><U0048>" % CYRILLIC CAPITAL LETTER KJE <U040C> <U1E30>;"<U004B><U0060>" % CYRILLIC CAPITAL LETTER SHORT U <U040E> <U016C>;"<U0055><U0060>" % CYRILLIC CAPITAL LETTER DZHE <U040F> "<U0044><U0302>";"<U0044><U0068>" % CYRILLIC CAPITAL LETTER A <U0410> <U0041> % CYRILLIC CAPITAL LETTER BE <U0411> <U0042> % CYRILLIC CAPITAL LETTER VE <U0412> <U0056> % CYRILLIC CAPITAL LETTER GHE <U0413> <U0047> % CYRILLIC CAPITAL LETTER DE <U0414> <U0044> % CYRILLIC CAPITAL LETTER IE <U0415> <U0045> % CYRILLIC CAPITAL LETTER ZHE <U0416> <U017D>;"<U005A><U0048>" % CYRILLIC CAPITAL LETTER ZE <U0417> <U005A> % CYRILLIC CAPITAL LETTER I <U0418> <U0049> % CYRILLIC CAPITAL LETTER SHORT I <U0419> <U004A> % CYRILLIC CAPITAL LETTER KA <U041A> <U004B> % CYRILLIC CAPITAL LETTER EL <U041B> <U004C> % CYRILLIC CAPITAL LETTER EM <U041C> <U004D> % CYRILLIC CAPITAL LETTER EN <U041D> <U004E> % CYRILLIC CAPITAL LETTER O <U041E> <U004F> % CYRILLIC CAPITAL LETTER PE <U041F> <U0050> % CYRILLIC CAPITAL LETTER ER <U0420> <U0052> % CYRILLIC CAPITAL LETTER ES <U0421> <U0053> % CYRILLIC CAPITAL LETTER TE <U0422> <U0054> % CYRILLIC CAPITAL LETTER U <U0423> <U0055> % CYRILLIC UNDEFINED "<U0423><U0301>" <U00DA>;"<U0055><U0060>" % CYRILLIC CAPITAL LETTER EF <U0424> <U0046> % 
CYRILLIC CAPITAL LETTER HA <U0425> <U0048>;<U0058> % CYRILLIC CAPITAL LETTER TSE <U0426> <U0043>;"<U0043><U005A>" % CYRILLIC CAPITAL LETTER CHE <U0427> <U010C>;"<U0043><U0048>" % CYRILLIC CAPITAL LETTER SHA <U0428> <U0160>;"<U0053><U0048>" % CYRILLIC CAPITAL LETTER SHCHA <U0429> <U015C>;"<U0053><U0048><U0048>" % CYRILLIC CAPITAL LETTER HARD SIGN <U042A> <U02BA>;"<U0041><U0060>" % CYRILLIC CAPITAL LETTER YERU <U042B> <U0059>;"<U0059><U0060>" % CYRILLIC CAPITAL LETTER SOFT SIGN <U042C> <U02B9>;<U0060> % CYRILLIC CAPITAL LETTER E <U042D> <U00C8>;"<U0045><U0060>" % CYRILLIC CAPITAL LETTER YU <U042E> <U00DB>;"<U0059><U0055>" % CYRILLIC CAPITAL LETTER YA <U042F> <U00C2>;"<U0059><U0041>" % CYRILLIC SMALL LETTER A <U0430> <U0061> % CYRILLIC SMALL LETTER BE <U0431> <U0062> % CYRILLIC SMALL LETTER VE <U0432> <U0076> % CYRILLIC SMALL LETTER GHE <U0433> <U0067> % CYRILLIC SMALL LETTER DE <U0434> <U0064> % CYRILLIC SMALL LETTER IE <U0435> <U0065> % CYRILLIC SMALL LETTER ZHE <U0436> <U017E>;"<U007A><U0068>" % CYRILLIC SMALL LETTER ZE <U0437> <U007A> % CYRILLIC SMALL LETTER I <U0438> <U0069> % CYRILLIC SMALL LETTER SHORT I <U0439> <U006A> % CYRILLIC SMALL LETTER KA <U043A> <U006B> % CYRILLIC SMALL LETTER EL <U043B> <U006C> % CYRILLIC SMALL LETTER EM <U043C> <U006D> % CYRILLIC SMALL LETTER EN <U043D> <U006E> % CYRILLIC SMALL LETTER O <U043E> <U006F> % CYRILLIC SMALL LETTER PE <U043F> <U0070> % CYRILLIC SMALL LETTER ER <U0440> <U0072> % CYRILLIC SMALL LETTER ES <U0441> <U0073> % CYRILLIC SMALL LETTER TE <U0442> <U0074> % CYRILLIC SMALL LETTER U <U0443> <U0075> % CYRILLIC UNDEFINED "<U0443><U0301>" <U00FA>;"<U0075><U0060>" % CYRILLIC SMALL LETTER EF <U0444> <U0066> % CYRILLIC SMALL LETTER HA <U0445> <U0068>;<U0078> % CYRILLIC SMALL LETTER TSE <U0446> <U0063>;"<U0063><U007A>" % CYRILLIC SMALL LETTER CHE <U0447> <U010D>;"<U0063><U0068>" % CYRILLIC SMALL LETTER SHA <U0448> <U0161>;"<U0073><U0068>" % CYRILLIC SMALL LETTER SHCHA <U0449> <U015D>;"<U0073><U0068><U0068>" % CYRILLIC SMALL 
LETTER HARD SIGN <U044A> <U02BA>;"<U0060><U0060>" % CYRILLIC SMALL LETTER YERU <U044B> <U0079>;"<U0079><U0060>" % CYRILLIC SMALL LETTER SOFT SIGN <U044C> <U02B9>;<U0060> % CYRILLIC SMALL LETTER E <U044D> <U00E8>;"<U0065><U0060>" % CYRILLIC SMALL LETTER YU <U044E> <U00FB>;"<U0079><U0075>" % CYRILLIC SMALL LETTER YA <U044F> <U00E2>;"<U0079><U0061>" % CYRILLIC SMALL LETTER IO <U0451> <U00EB>;"<U0079><U006F>" % CYRILLIC SMALL LETTER DJE <U0452> <U0111>;"<U0064><U006A>" % CYRILLIC SMALL LETTER GJE <U0453> <U01F5>;"<U0067><U0060>" % CYRILLIC SMALL LETTER UKRAINIAN IE <U0454> <U00EA>;"<U0079><U0065>" % CYRILLIC SMALL LETTER DZE <U0455> <U1E91>;"<U007A><U0060>" % CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I <U0456> <U00EC>;<U0069> % CYRILLIC SMALL LETTER YI <U0457> <U00EF>;"<U0079><U0069>" % CYRILLIC SMALL LETTER JE <U0458> <U01F0>;<U006A> % CYRILLIC SMALL LETTER LJE <U0459> "<U006C><U0302>";"<U006C><U0060>" % CYRILLIC SMALL LETTER NJE <U045A> "<U006E><U0302>";"<U006E><U0060>" % CYRILLIC SMALL LETTER TSHE <U045B> <U0107>;"<U0074><U0073><U0068>" % CYRILLIC SMALL LETTER KJE <U045C> <U1E31>;"<U006B><U0060>" % CYRILLIC SMALL LETTER SHORT U <U045E> <U016D>;"<U0075><U0060>" % CYRILLIC SMALL LETTER DZHE <U045F> "<U0064><U0302>";"<U0064><U0068>" % CYRILLIC CAPITAL LETTER BIG YUS <U046A> <U01CD>;"<U004F><U0060>" % CYRILLIC SMALL LETTER BIG YUS <U046B> <U01CE>;"<U006F><U0060>" % CYRILLIC CAPITAL LETTER FITA <U0472> "<U0046><U0300>";"<U0046><U0068>" % CYRILLIC SMALL LETTER FITA <U0473> "<U0066><U0300>";"<U0066><U0068>" % CYRILLIC CAPITAL LETTER IZHITSA <U0474> <U1EF2>;"<U0059><U0068>" % CYRILLIC SMALL LETTER IZHITSA <U0475> <U1EF3>;"<U0079><U0068>" % CYRILLIC CAPITAL LETTER SEMISOFT SIGN <U048C> <U011A>;"<U0045><U0060>" % CYRILLIC SMALL LETTER SEMISOFT SIGN <U048D> <U011B>;"<U0065><U0060>" % CYRILLIC CAPITAL LETTER GHE WITH UPTURN <U0490> "<U0047><U0300>";"<U0047><U0060>" % CYRILLIC SMALL LETTER GHE WITH UPTURN <U0491> "<U0067><U0300>";"<U0067><U0060>" % CYRILLIC CAPITAL LETTER GHE 
WITH STROKE <U0492> <U0120>;"<U0047><U0048>" % CYRILLIC SMALL LETTER GHE WITH STROKE <U0493> <U0121>;"<U0067><U0068>" % CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK <U0494> <U011E>;"<U0047><U0048>" % CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK <U0495> <U011F>;"<U0067><U0068>" % CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER <U0496> "<U017D><U0327>";"<U005A><U0048><U0060>" % CYRILLIC SMALL LETTER ZHE WITH DESCENDER <U0497> "<U017E><U0327>";"<U007A><U0068><U0060>" % CYRILLIC CAPITAL LETTER KA WITH DESCENDER <U049A> <U0136>;"<U004B><U0060>" % CYRILLIC SMALL LETTER KA WITH DESCENDER <U049B> <U0137>;"<U006B><U0060>" % CYRILLIC CAPITAL LETTER KA WITH STROKE <U049E> "<U004B><U0304>";"<U004B><U0060>" % CYRILLIC SMALL LETTER KA WITH STROKE <U049F> "<U006B><U0304>";"<U006B><U0060>" % CYRILLIC CAPITAL LETTER EN WITH DESCENDER <U04A2> <U1E46>;"<U004E><U0060>" % CYRILLIC SMALL LETTER EN WITH DESCENDER <U04A3> <U1E47>;"<U006E><U0060>" % CYRILLIC CAPITAL LIGATURE EN GHE <U04A4> <U1E44>;"<U004E><U0047>" % CYRILLIC SMALL LIGATURE EN GHE <U04A5> <U1E45>;"<U006E><U0067>" % CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK <U04A6> <U1E54>;"<U0050><U0060>" % CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK <U04A7> <U1E55>;"<U0070><U0060>" % CYRILLIC CAPITAL LETTER ABKHASIAN HA <U04A8> <U00D2>;"<U004F><U0060>" % CYRILLIC SMALL LETTER ABKHASIAN HA <U04A9> <U00F2>;"<U006F><U0060>" % CYRILLIC CAPITAL LETTER ES WITH DESCENDER <U04AA> <U00C7>;"<U0043><U0060>" % CYRILLIC SMALL LETTER ES WITH DESCENDER <U04AB> <U00E7>;"<U0043><U0060>" % CYRILLIC CAPITAL LETTER TE WITH DESCENDER <U04AC> <U0162>;"<U0054><U0060>" % CYRILLIC SMALL LETTER TE WITH DESCENDER <U04AD> <U0163>;"<U0074><U0060>" % CYRILLIC CAPITAL LETTER STRAIGHT U <U04AE> <U00D9>;<U0055> % CYRILLIC SMALL LETTER STRAIGHT U <U04AF> <U00F9>;<U0075> % CYRILLIC CAPITAL LETTER HA WITH DESCENDER <U04B2> <U1E28>;"<U0048><U0060>" % CYRILLIC SMALL LETTER HA WITH DESCENDER <U04B3> <U1E29>;"<U0068><U0060>" % CYRILLIC CAPITAL LIGATURE TE TSE <U04B4> 
"<U0043><U0304>";"<U0054><U0043><U005A>" % CYRILLIC SMALL LIGATURE TE TSE <U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>" % CYRILLIC CAPITAL LETTER SHHA <U04BA> <U1E24>;"<U0053><U0048><U0060>" % CYRILLIC SMALL LETTER SHHA <U04BB> <U1E25>;"<U0053><U0048><U0060>" % CYRILLIC CAPITAL LETTER ABKHASIAN CHE <U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>" % CYRILLIC SMALL LETTER ABKHASIAN CHE <U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>" % CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER <U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>" % CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER <U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>" % CYRILLIC LETTER PALOCHKA <U04C0> <U2021>;<U0069> % CYRILLIC CAPITAL LETTER ZHE WITH BREVE <U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>" % CYRILLIC SMALL LETTER ZHE WITH BREVE <U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>" % CYRILLIC CAPITAL LETTER KHAKASSIAN CHE <U04CB> <U00C7>;"<U0043><U0048><U0060>" % CYRILLIC SMALL LETTER KHAKASSIAN CHE <U04CC> <U00E7>;"<U0063><U0068><U0060>" % CYRILLIC CAPITAL LETTER A WITH BREVE <U04D0> <U0102>;"<U0041><U0060>" % CYRILLIC SMALL LETTER A WITH BREVE <U04D1> <U0103>;"<U0061><U0060>" % CYRILLIC CAPITAL LETTER A WITH DIAERESIS <U04D2> <U00C4>;"<U0041><U0060>" % CYRILLIC SMALL LETTER A WITH DIAERESIS <U04D3> <U00E4>;"<U0061><U0060>" % CYRILLIC CAPITAL LETTER IE WITH BREVE <U04D6> <U0114>;"<U0045><U0060>" % CYRILLIC SMALL LETTER IE WITH BREVE <U04D7> <U0115>;"<U0065><U0060>" % CYRILLIC CAPITAL LETTER SCHWA <U04D8> "<U0041><U030B>";"<U0041><U0060>" % CYRILLIC SMALL LETTER SCHWA <U04D9> "<U0061><U030B>";"<U0061><U0060>" % CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS <U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>" % CYRILLIC SMALL LETTER ZHE WITH DIAERESIS <U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>" % CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS <U04DE> "<U005A><U0308>";"<U005A><U0060>" % CYRILLIC SMALL LETTER ZE WITH DIAERESIS <U04DF> "<U007A><U0308>";"<U007A><U0060>" % CYRILLIC CAPITAL LETTER 
ABKHASIAN DZE <U04E0> <U0179>;"<U005A><U0060>" % CYRILLIC SMALL LETTER ABKHASIAN DZE <U04E1> <U017A>;"<U007A><U0060>" % CYRILLIC CAPITAL LETTER I WITH DIAERESIS <U04E4> <U00CE>;"<U0049><U0060>" % CYRILLIC SMALL LETTER I WITH DIAERESIS <U04E5> <U00EE>;"<U0069><U0060>" % CYRILLIC CAPITAL LETTER O WITH DIAERESIS <U04E6> <U00D6>;"<U004F><U0060>" % CYRILLIC SMALL LETTER O WITH DIAERESIS <U04E7> <U00F6>;"<U006F><U0060>" % CYRILLIC CAPITAL LETTER BARRED O <U04E8> <U00D4>;"<U004F><U0060>" % CYRILLIC SMALL LETTER BARRED O <U04E9> <U00F4>;"<U006F><U0060>" % CYRILLIC CAPITAL LETTER U WITH DIAERESIS <U04F0> <U00DC>;"<U0055><U0060>" % CYRILLIC SMALL LETTER U WITH DIAERESIS <U04F1> <U00FC>;"<U0075><U0060>" % CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE <U04F2> <U0170>;"<U0055><U0060>" % CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE <U04F3> <U0171>;"<U0075><U0060>" % CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS <U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>" % CYRILLIC SMALL LETTER CHE WITH DIAERESIS <U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>" % CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS <U04F8> <U0178>;"<U0059><U0060>" % CYRILLIC SMALL LETTER YERU WITH DIAERESIS <U04F9> <U00FF>;"<U0079><U0060>" % RIGHT SINGLE QUOTATION MARK <U2019> <U2035>;<U0027> translit_end END LC_CTYPE
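A table of this size is easy to get subtly wrong, so a small consistency check may help before submitting. The sketch below is an assumption-laden illustration in Python: it uses a simplified parser written for this example (not localedef's), with hypothetical helper names, and flags rules where a lowercase source letter has an alternative containing uppercase letters. Fed the two ES WITH DESCENDER rules from the table above, it reports the SMALL LETTER entry, whose ASCII alternative appears to use a capital C.

```python
import re

CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0059><U004F>' (optionally quoted) into the string 'YO'."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT.findall(token))


def parse_rule(line):
    """Return (source, [alternatives]) for a rule line, else None."""
    line = line.strip()
    if not line.startswith("<U"):
        return None  # comment, keyword, or blank line
    source, _, targets = line.partition(" ")
    return decode(source), [decode(t) for t in targets.strip().split(";")]


sample = '''
% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
<U04AA> <U00C7>;"<U0043><U0060>"
% CYRILLIC SMALL LETTER ES WITH DESCENDER
<U04AB> <U00E7>;"<U0043><U0060>"
'''

rules = [r for r in map(parse_rule, sample.splitlines()) if r]

# Lowercase sources whose alternatives contain uppercase letters.
suspicious = [
    src for src, alts in rules
    if src.islower() and any(ch.isupper() for alt in alts for ch in alt)
]
for src in suspicious:
    print(f"check case of rule for U+{ord(src):04X}")
```

The same loop could be run over the whole file; whether a flagged rule is really a mistake still needs a human decision.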
On 10/8/18 7:20 PM, Zack Weinberg wrote:
> On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski
> <digitalfreak@lingonborough.com> wrote:
>> The problem is that we don't have a separate maintainer for each locale,
>> we have only 2 maintainers for about 200 locales and we must represent
>> them all. Sometimes a locale may happen to be our own native locale or
>> of someone in this list, or it may be a locale which we accidentally can
>> speak as a foreign language, or we may have friends who can speak it.
>> Or it may be totally unknown and we still must somehow handle it.
>
> I just want to mention that this is also why most of the non-locale
> maintainers tend to stay out of threads about locales. We know we're
> even less expert on these issues than you are, and I think as a
> general rule you should be assuming that the community is OK with what
> you're doing unless someone speaks up to object.

I agree with Zack here.

Rafal and Mike are localedata subsystem maintainers, and your best
efforts are the best we have right now in the community.

I also agree that a conservative position is always a good place to
start, but it sounds like Egor has added enough coverage to perhaps make
all of these transliterations opt-in by default.

I don't have a good sense of this though, and so I defer to you as
the subsystem maintainer to review and formulate a position.

If you have any specific questions, I can certainly help review.
Hi,

On 2018-10-09 01:04, Rafal Luzynski wrote:
>
> Particularly, I think that those rules will not be helpful at all for
> the languages which use neither Latin nor Cyrillic alphabet.

This is certainly a very good point.

> If you refer to other languages than Russian which also use the Cyrillic
> alphabet but need a different transliteration rules than Russian for
> the same characters then it is OK for me now. I am afraid that the iconv
> algorithm does not handle such case. Of course, we should add this missing
> feature eventually but I do not volunteer to do it now.

Yes, this would be needed for correct transliteration of different
languages, and this might be quite a bit of work. There's also the case
of transliteration and character sets; consider the transliteration
examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:

Russian: Борис Николаевич Ельцин
Int'l: Boris Nikolaevič Elʹcin
Finnish: Boris Nikolajevitš Jeltsin
French: Boris Nikolaïevitch Ieltsine
Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]

For French you'll get the correct transliteration with iconv by using -t
ISO-8859-1//TRANSLIT, for Finnish with -t ISO-8859-15//TRANSLIT, but it's
not so obvious how to get the above kind of transliteration for ISO 9
international or especially for the phonetic case.

One thing that might be helpful here could be something like:

$ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
ž

That is, force transliteration of each character (if defined) even if
it's part of the target character set. AFAICS this is not currently
possible.

> But, while at this, is there anything that stops us from adding transliteration
> rules for additional Cyrillic characters not used in Russian but used in
> other languages?

This would probably make sense.

FWIW, for Finnish the diff for Russian to be applied in the locale on
top of translit_cyrillic (ISO 9) rules would be something like below. I
still need to check whether there are rules needed for other languages
than Russian that could be added (I hope to submit a proper patch
against fi_FI shortly after translit_cyrillic has landed):

<U0446> "<U0074><U0073>"
<U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
<U0448> "<U0161>";"<U0073><U0068>"
<U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
<U044A> ""
<U044C> ""
<U044D> "<U0065>"
<U044E> "<U006A><U0075>"
<U044F> "<U006A><U0061>"
<U0451> "<U006A><U006F>"

Thanks,
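The two ideas in this message, locale-specific rules layered on top of a shared table and a "force" mode that transliterates even characters the target charset can represent, can be modelled in a few lines of Python. This is a toy model under stated assumptions, not how iconv is implemented; the `force` flag only mirrors the proposed TRANSLIT_FORCE behaviour, which does not exist in iconv, and the function names are hypothetical.

```python
def encodable(text, charset):
    """True if every character of text exists in the target charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, table, charset, force=False):
    """Pick the first alternative the charset can represent, else '?'."""
    out = []
    for ch in text:
        if encodable(ch, charset) and not (force and ch in table):
            out.append(ch)  # representable as-is, nothing to do
            continue
        alts = table.get(ch, [])
        out.append(next((a for a in alts if encodable(a, charset)), "?"))
    return "".join(out)


# Three rules taken from the shared table and from the Finnish diff above.
base = {"\u0447": ["\u010d", "ch"], "\u0448": ["\u0161", "sh"],
        "\u044e": ["\u00fb", "yu"]}
finnish = {"\u0447": ["t\u0161", "tsh"], "\u0448": ["\u0161", "sh"],
           "\u044e": ["ju"]}
table = {**base, **finnish}  # locale rules override the shared ones

word = "\u0447\u044e"  # CHE followed by YU
print(transliterate(word, base, "ascii"))
print(transliterate(word, table, "ascii"))
print(transliterate(word, table, "iso8859-15"))
print(transliterate("\u0436", {"\u0436": ["\u017e", "zh"]}, "utf-8", force=True))
```

In this model the target charset alone decides which alternative wins, which is why the Finnish rules give different results for ASCII and ISO-8859-15 without any further configuration.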
In the hope to be helpful: what you describe below from
https://fi.wikipedia.org/wiki/Siirtokirjoitus is called _transcription_,
not transliteration.

Transliteration is what we have done with ISO 9 or GOST 7.79 System A
and it could be the same for all languages indeed.

The transcription can be phonetic or serve other purposes and depends on
the target language or use case. We have used the GOST 7.79 System B.

Egor

On 09.10.2018 18:10, Marko Myllynen wrote:
> [...]
Hi,

To clarify, the page has a section explaining the differences between
transliteration and transcription and how the terminology is not
entirely unambiguous. It also explains that the national standard SFS
4900 overrides ISO 9, thus ISO 9 can't be used as-is in Finnish context.

Thanks,

On 2018-10-09 19:22, Egor Kobylkin wrote:
> [...]
The culprits were the double quotes around the source sequences
"<U0423><U0301>" (<U00DA>) and "<U0443><U0301>" (<U00FA>). It works now
with

% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"

The <U0301> is "combining" and apparently it doesn't work if enclosed in
quotes together with the letter codepoint. Please let me know if there
is another explanation.

I will now make those changes and generate the patch itself.

Egor

On 09.10.2018 15:18, Egor Kobylkin wrote:
> [...]
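To make the intent of the two-codepoint rules concrete: a table that has both a rule for the bare letter U and a rule for U followed by the combining acute accent needs the longer sequence to be tried first. The Python sketch below only illustrates that idea on the rules quoted in this message; it is not glibc's lookup code and does not explain the hang, and the function name is hypothetical.

```python
RULES = {
    "\u0423": ["U"],                    # CYRILLIC CAPITAL LETTER U
    "\u0423\u0301": ["\u00da", "U`"],   # U + COMBINING ACUTE ACCENT
}


def transliterate_ascii(text, rules):
    """Replace the longest matching source sequence at each position."""
    longest = max(map(len, rules))
    out, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in rules:
                # First alternative that is pure ASCII, else '?'.
                out.append(next((a for a in rules[chunk] if a.isascii()), "?"))
                i += size
                break
        else:
            # No rule matched here: keep ASCII, mark anything else.
            out.append(text[i] if text[i].isascii() else "?")
            i += 1
    return "".join(out)


# Accented U followed by a plain U.
print(transliterate_ascii("\u0423\u0301\u0423", RULES))
```

If the single-letter rule were tried first, the accent would be left over as an untransliterable character, which is why the order of attempts matters in this model.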
9.10.2018 00:52 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Just to make sure we are not talking at cross purposes. Since your last
> email on this topic on the suggestion from Marko I have already
> implemented ISO 9 transliteration for all characters there are. This
> should cover most if not all Slavic Cyrillic. You seem to have just
> noticed and replied to this email of Marko as I write mine.

That's great. I'm sorry about not noticing this before; as you can see,
this only confirms that I'm unable to give proper attention to your
bug.

9.10.2018 01:35 Egor Kobylkin <egor@kobylkin.com> wrote:
> On 09.10.2018 00:23, Rafal Luzynski wrote:
> > Are the duplicates here because some Cyrillic letters may have multiple
> > Latin transliterations depending on the context, for example Cyrillic IE
> > must be transliterated sometimes as "e", sometimes as "ie", sometimes
> > as "ye" or "je"? Can we provide rules for groups of characters instead?
>
> No, the duplicates are just by design of my line generating logic. I
> have fixed (removed) them. The varying transcription between
> languages/locales can not be handled in one file at all as far as I
> understood.

No, I did not mean different languages here, but that some letters may
need to be transliterated in a different way depending on the context.
For example, the letter "е" might be transliterated as "e" or "ie" or
"je" depending on whether it appears after "ж" or after another
consonant or after a vowel or a soft or hard sign etc. All within the
Russian language. (Sorry if I'm getting this wrong; what I wrote may be
wrong for this letter but correct for another combination of letters.)

Regards,

Rafal
9.10.2018 17:26 Carlos O'Donell <carlos@redhat.com> wrote: > [...] > but it sounds like Egor has added enough coverage to perhaps make all of > these transliterations opt-in by default. I think that it is correct if this transliteration is meant to be "Russian language as if it used a Latin alphabet (even if it does not actually except in some computer systems which do not support Cyrillic)" but not if it is meant to be "Russian language to make sure it is comfortable for reading by English speakers (assuming that everyone else should be fine with English if their native language is not supported)". Regards, Rafal
9.10.2018 18:10 Marko Myllynen <myllynen@redhat.com> wrote:
> On 2018-10-09 01:04, Rafal Luzynski wrote:
> > If you refer to other languages than Russian which also use the Cyrillic
> > alphabet but need a different transliteration rules than Russian for
> > the same characters then it is OK for me now. I am afraid that the iconv
> > algorithm does not handle such case. Of course, we should add this missing
> > feature eventually but I do not volunteer to do it now.
>
> Yes, this would be needed for correct transliteration of different
> languages, and this might be quite a bit of work. There's also the case
> of transliteration and character sets, consider the transliteration
> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>
> Russian: Борис Николаевич Ельцин
> Int'l: Boris Nikolaevič Elʹcin
> Finnish: Boris Nikolajevitš Jeltsin
> French: Boris Nikolaïevitch Ieltsine
> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]

No, I did not mean transcription using the rules of the destination
locale that uses Latin, but that the rules of transliteration may differ
depending on the language of the source text. For example, consider
this Cyrillic string: "нъг" (I'm not saying that it is actually used
in any existing word, but it still must be handled). By our transliteration
rules it will be transliterated as "n``g". This is fine for Russian, but
if we knew that the source string was Ukrainian it would be transliterated
as "n``h"; if it was Bulgarian it would be transliterated as "năg".
Similarly, if you had to transliterate the Latin letters "sch" to Cyrillic,
you would first have to ask what the source language was.

Unfortunately, I think that distinguishing the source language is impossible
at the moment, so let's assume that we fall back to Russian if there is
any ambiguity.

Regards,

Rafal
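[Editorial sketch] The source-language dependence described above can be
illustrated with a small lookup table plus per-language overrides. This is
only a minimal sketch under stated assumptions: the function name, the
table names and the language tags are hypothetical, only the three letters
from the "нъг" example are covered, and nothing here reflects how glibc or
iconv is actually implemented.

```python
# Hypothetical tables: the mappings are the ones quoted in the message
# above for the string "нъг"; everything else is left out on purpose.
BASE_RULES = {"н": "n", "ъ": "``", "г": "g"}  # Russian, used as the fallback

OVERRIDES = {
    "uk": {"г": "h"},        # Ukrainian: г -> h
    "bg": {"ъ": "\u0103"},   # Bulgarian: ъ -> ă (LATIN SMALL LETTER A WITH BREVE)
}


def transliterate(text, source_lang="ru"):
    """Transliterate text, falling back to the Russian rules when the
    source language is unknown or has no override for a letter."""
    rules = {**BASE_RULES, **OVERRIDES.get(source_lang, {})}
    # Characters without a rule are passed through unchanged.
    return "".join(rules.get(ch, ch) for ch in text)
```

With such a table, transliterate("нъг") and transliterate("нъг", "xx") both
use the Russian fallback, which mirrors the compromise proposed above for
the case where the source language cannot be determined.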
9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote: > > The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and > "<U0443><U0301>" (<U00FA>). > It works now with > % CYRILLIC UNDEFINED > <U0423><U0301> <U00DA>;"<U0055><U0060>" > % CYRILLIC UNDEFINED > <U0443><U0301> <U00FA>;"<U0075><U0060>" > > [...] I wonder why you need Cyrillic U with acute, and why you comment it as "undefined" at all. I know that any Cyrillic vowel may appear with an acute accent but "the diacritic is used only in dictionaries, children's books, resources for foreign-language learners (...)". [1] So maybe all vowels with an acute accent should be handled (which I think is fine) rather than just U. Regards, Rafal [1] https://en.wikipedia.org/wiki/Russian_alphabet#Diacritics
On 10.10.2018 00:17, Rafal Luzynski wrote:
> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>> "<U0443><U0301>" (<U00FA>).
>> It works now with
>> % CYRILLIC UNDEFINED
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>> % CYRILLIC UNDEFINED
>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>
>> [...]
>
> I wonder why you need Cyrillic U with acute, and why you comment it
> as "undefined" at all. I know that any Cyrillic vowel may appear with
> an acute accent but "the diacritic is used only in dictionaries, children's
> books, resources for foreign-language learners (...)". [1] So maybe
> all vowels with an acute accent should be handled (which I think is fine)
> rather than just U.

I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
implemented it on Marko's suggestion. Personally I have no opinion on
what letters should be included and under what name. These funny Us just
happened to be in the ISO 9 table.

There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
in Unicode. That’s why it’s coming through that way from my worksheet, as
it does a reverse lookup on the names based on the Unicode codepoints.

Manually we can change it to whatever you’d suggest in the
translit_cyrillic. I just don’t know the right name.

On my side I think I have all outstanding tasks complete for the patch
https://sourceware.org/bugzilla/attachment.cgi?id=11144. So please let
me know explicitly if you'd like anything changed there.

I was planning to rewrite just the commit message according to your
earlier feedback and resubmit sometime soon.

Bests,
Egor
Oops, sorry, wrong link to the patch. The correct link is
https://sourceware.org/bugzilla/attachment.cgi?id=11303

On 10.10.2018 00:40, Egor Kobylkin wrote:
> On 10.10.2018 00:17, Rafal Luzynski wrote:
>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>> "<U0443><U0301>" (<U00FA>).
>>> It works now with
>>> % CYRILLIC UNDEFINED
>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>> % CYRILLIC UNDEFINED
>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>
>>> [...]
>>
>> I wonder why you need Cyrillic U with acute, and why you comment it
>> as "undefined" at all. I know that any Cyrillic vowel may appear with
>> an acute accent but "the diacritic is used only in dictionaries, children's
>> books, resources for foreign-language learners (...)". [1] So maybe
>> all vowels with an acute accent should be handled (which I think is fine)
>> rather than just U.
>
> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
> implemented it on Marko's suggestion. Personally I have no opinion on
> what letters should be included and under what name. These funny Us just
> happened to be in the ISO9 table.
>
> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
> in Unicode. That’s why its coming through that way from my worksheet as
> it does a reverse lookup on the names based on the Unicode codepoints.
>
> Manually we can change it to whatever you’d suggest in the
> translit_cyrillic. I just don’t know the right name.
>
> On my side I think I have all outstanding tasks complete for the patch
> https://sourceware.org/bugzilla/attachment.cgi?id=11144. So please let
> me know explicitly if you'd like anything changed there.

Correct link: https://sourceware.org/bugzilla/attachment.cgi?id=11303

>
> I was planning to rewrite just the commit message according to your
> earlier feedback and resubmit sometime soon.
>

Bests,
Egor
Hi, On 2018-10-10 01:08, Rafal Luzynski wrote: > 9.10.2018 18:10 Marko Myllynen <myllynen@redhat.com> wrote: >> On 2018-10-09 01:04, Rafal Luzynski wrote: >>> If you refer to other languages than Russian which also use the Cyrillic >>> alphabet but need a different transliteration rules than Russian for >>> the same characters then it is OK for me now. I am afraid that the iconv >>> algorithm does not handle such case. Of course, we should add this missing >>> feature eventually but I do not volunteer to do it now. >> >> Yes, this would be needed for correct transliteration of different >> languages, and this might be quite a bit of work. There's also the case >> of transliteration and character sets, consider the transliteration >> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus: >> >> Russian: Борис Николаевич Ельцин >> Int'l: Boris Nikolaevič Elʹcin >> Finnish: Boris Nikolajevitš Jeltsin >> French: Boris Nikolaïevitch Ieltsine >> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn] > > No, I did not mean the transcription using the rules of the destination > locale using Latin but that the rules of transliteration may be different > depending on the language of the source text. Yes, I mentioned this case in my earlier email: https://sourceware.org/ml/libc-alpha/2018-10/msg00083.html > this Cyrillic string: "нъг" (I'm not telling that it is actually used > in any existing word but still must be handled). By our transliteration > rules it will be transliterated as "n``g". But this is fine for Russian; > if we knew that the source string is Ukrainian it would be transliterated > as "n``h"; if it was Bulgarian it would be transliterated as "năg". And according to SFS 4900, in fi_FI for this string we would see for Russian ng, for Ukrainian nh, and for Bulgarian năg. > Unfortunately, I think that distinction of the source language is impossible > at the moment so let's assume that we fall back to Russian if there is > any ambiguity. 
Yeah, it's not optimal but probably the most decent compromise for now. Thanks,
Hi,

On 2018-10-10 01:42, Egor Kobylkin wrote:
> Ups, sorry, wrong link to the patch
> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303

Although I haven't checked every rule, this in general looks very good
(but see below). I'm not sure whether we want to add the few missing
characters mentioned at
https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode; e.g., one
instantly notices that U+0400 is missing. (I wouldn't add the more exotic
characters, like the historic ones, at least initially.) Perhaps filing
a bug or two for these cases for separate consideration would be OK.

> On 10.10.2018 00:40, Egor Kobylkin wrote:
>> On 10.10.2018 00:17, Rafal Luzynski wrote:
>>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>>> "<U0443><U0301>" (<U00FA>).
>>>> It works now with
>>>> % CYRILLIC UNDEFINED
>>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>> % CYRILLIC UNDEFINED
>>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>>
>>>> [...]
>>>
>>> I wonder why you need Cyrillic U with acute, and why you comment it
>>> as "undefined" at all. I know that any Cyrillic vowel may appear with
>>> an acute accent but "the diacritic is used only in dictionaries, children's
>>> books, resources for foreign-language learners (...)". [1] So maybe
>>> all vowels with an acute accent should be handled (which I think is fine)
>>> rather than just U.
>>
>> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
>> implemented it on Marko's suggestion. Personally I have no opinion on
>> what letters should be included and under what name. These funny Us just
>> happened to be in the ISO9 table.
>>
>> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
>> in Unicode. That’s why its coming through that way from my worksheet as
>> it does a reverse lookup on the names based on the Unicode codepoints.
>>
>> Manually we can change it to whatever you’d suggest in the
>> translit_cyrillic. I just don’t know the right name.

I'm not sure this will work: no existing rule in the translit_* files
contains two characters, so I'd assume that the rule for U+0423 is applied
first and then the rule below is never used.

% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"

Perhaps this should be commented out or removed altogether if it's not
working as intended.

Thanks,
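[Editorial sketch] The concern above, that a rule keyed on two code points
is never reached when lookup happens one code point at a time, can be shown
with a toy model. This is a sketch of the assumed behaviour only, not
glibc's actual lookup code; the function names and the RULES table are
hypothetical, and the replacement strings are simplified to the ASCII
variants of the rules under discussion.

```python
# Toy rule table: one single-code-point key and one two-code-point key.
RULES = {
    "\u0423": "U",          # CYRILLIC CAPITAL LETTER U
    "\u0423\u0301": "U`",   # U followed by COMBINING ACUTE ACCENT
}


def translit_per_code_point(text):
    """Look up one code point at a time. A key made of two code points
    can never match, so the combining accent passes through unchanged."""
    return "".join(RULES.get(ch, ch) for ch in text)


def translit_longest_match(text):
    """Alternative for comparison: try the longest key first, so that
    multi-code-point keys take precedence over their prefixes."""
    max_len = max(len(key) for key in RULES)
    out = []
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += n
                break
        else:
            # No rule matched at this position: copy the character as is.
            out.append(text[i])
            i += 1
    return "".join(out)
```

In this model only the longest-match variant ever uses the two-code-point
rule, which is why such a rule would be dead weight under per-code-point
lookup.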
On 10.10.2018 13:22, Marko Myllynen wrote:
>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>
> Although I haven't checked every rule this in general looks very good
> (but see below).

> Not sure do we want to add the few missing characters
> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
> least initially the more exotic characters, like the historic ones,
> though.) Perhaps filing a bug or two for these cases for separate
> consideration would be ok.

The question here is what should serve as their transliteration and
transcription. They are covered neither by ISO 9 nor by GOST 7.79. So
maybe it would be reasonable to assume there is no notable occurrence of
those anywhere?

Anyway, I am happy to include your specific suggestions for any and all
Unicode quartets in this form:
[Cyrillic Unicode ; ISO 9 Latin Transliteration (System A) as Unicode ;
Transcription (System B) as (multicharacter) ASCII ; name to put in
%COMMENT ].

>
>> On 10.10.2018 00:40, Egor Kobylkin wrote:
>>> On 10.10.2018 00:17, Rafal Luzynski wrote:
>>>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>
>>>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>>>> "<U0443><U0301>" (<U00FA>).
>>>>> It works now with
>>>>> % CYRILLIC UNDEFINED
>>>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>>> % CYRILLIC UNDEFINED
>>>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>>>
>>>>> [...]
>>>>
>>>> I wonder why you need Cyrillic U with acute, and why you comment it
>>>> as "undefined" at all. I know that any Cyrillic vowel may appear with
>>>> an acute accent but "the diacritic is used only in dictionaries, children's
>>>> books, resources for foreign-language learners (...)". [1] So maybe
>>>> all vowels with an acute accent should be handled (which I think is fine)
>>>> rather than just U.
>>>
>>> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
>>> implemented it on Marko's suggestion. Personally I have no opinion on
>>> what letters should be included and under what name. These funny Us just
>>> happened to be in the ISO9 table.
>>>
>>> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
>>> in Unicode. That’s why its coming through that way from my worksheet as
>>> it does a reverse lookup on the names based on the Unicode codepoints.
>>>
>>> Manually we can change it to whatever you’d suggest in the
>>> translit_cyrillic. I just don’t know the right name.
>
> I'm not sure this will work, no existing rule in translit_* files
> contain two characters, I'd assume that the rule for U+0423 is applied
> first and then the below rule is never used.
>
> % CYRILLIC UNDEFINED
> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>
> Perhaps this should be commented out or removed altogether if it's not
> working as intended.

Here is the result of my test on
https://sourceware.org/bugzilla/attachment.cgi?id=11304

U0423 0301-У́ -> U0423 0301-U
U0443 0301-у́ -> U0443 0301-u

So yes, they are not processed. I would drop them so as not to have special
cases. But I am also fine with keeping them because all the work is done
already.

Result:

CYRILLIC RUSSIAN
S``esh` eshhyo e`tih myagkih francuzskih bulok, da vypej zhe chayu.
SA`ESH` ESHHYO E`TIH MYAGKIH FRANCUZSKIH BULOK? DA VYPEJ ZHE CHAYU!
CYRILLIC COMPLETE U0401-YO U0402-DJ U0403-G` U0404-Ye U0405-Z` U0406-I U0407-Yi U0408-J U0409-L` U040A-N` U040B-TSH U040C-K` U040E-U` U040F-Dh U0410-A U0411-B U0412-V U0413-G U0414-D U0415-E U0416-ZH U0417-Z U0418-I U0419-J U041A-K U041B-L U041C-M U041D-N U041E-O U041F-P U0420-R U0421-S U0422-T U0423-U U0423 0301-U U0424-F U0425-H U0426-C U0427-CH U0428-SH U0429-SHH U042A-`` U042B-Y U042C-` U042D-E` U042E-YU U042F-YA U0430-a U0431-b U0432-v U0433-g U0434-d U0435-e U0436-zh U0437-z U0438-i U0439-j U043A-k U043B-l U043C-m U043D-n U043E-o U043F-p U0440-r U0441-s U0442-t U0443-u U0443 0301-u U0444-f U0445-h U0446-c U0447-ch U0448-sh U0449-shh U044A-A` U044B-y U044C-` U044D-e` U044E-yu U044F-ya U0451-yo U0452-dj U0453-g` U0454-ye U0455-z` U0456-i U0457-yi U0458-j U0459-l` U045A-n` U045B-tsh U045C-k` U045E-u` U045F-dh U046A-O` U046B-o` U0472-Fh U0473-fh U0474-Yh U0475-yh U048C-E` U048D-e` U0490-G` U0491-g` U0492-GH U0493-gh U0494-GH U0495-gh U0496-ZH` U0497-zh` U049A-K` U049B-k` U049E-K` U049F-k` U04A2-N` U04A3-n` U04A4-NG U04A5-ng U04A6-P` U04A7-p` U04A8-O` U04A9-o` U04AA-C` U04AB-C` U04AC-T` U04AD-t` U04AE-U U04AF-u U04B2-H` U04B3-h` U04B4-TCZ U04B5-tcz U04BA-SH` U04BB-SH` U04BC-CH` U04BD-ch` U04BE-CH` U04BF-ch` U04C0-i U04C1-ZH` U04C2-zh` U04CB-CH` U04CC-ch` U04D0-A` U04D1-a` U04D2-A` U04D3-a` U04D6-E` U04D7-e` U04D8-A` U04D9-a` U04DC-ZH` U04DD-zh` U04DE-Z` U04DF-z` U04E0-Z` U04E1-z` U04E4-I` U04E5-i` U04E6-O` U04E7-o` U04E8-O` U04E9-o` U04F0-U` U04F1-u` U04F2-U` U04F3-u` U04F4-CH` U04F5-ch` U04F8-Y` U04F9-y` U2019-' Source: CYRILLIC RUSSIAN Съешь ещё этих мягких французских булок, да выпей же чаю. СЪЕШЬ ЕЩЁ ЭТИХ МЯГКИХ ФРАНЦУЗСКИХ БУЛОК? ДА ВЫПЕЙ ЖЕ ЧАЮ! 
CYRILLIC COMPLETE U0401-Ё U0402-Ђ U0403-Ѓ U0404-Є U0405-Ѕ U0406-І U0407-Ї U0408-Ј U0409-Љ U040A-Њ U040B-Ћ U040C-Ќ U040E-Ў U040F-Џ U0410-А U0411-Б U0412-В U0413-Г U0414-Д U0415-Е U0416-Ж U0417-З U0418-И U0419-Й U041A-К U041B-Л U041C-М U041D-Н U041E-О U041F-П U0420-Р U0421-С U0422-Т U0423-У U0423 0301-У́ U0424-Ф U0425-Х U0426-Ц U0427-Ч U0428-Ш U0429-Щ U042A-ъ U042B-Ы U042C-ь U042D-Э U042E-Ю U042F-Я U0430-а U0431-б U0432-в U0433-г U0434-д U0435-е U0436-ж U0437-з U0438-и U0439-й U043A-к U043B-л U043C-м U043D-н U043E-о U043F-п U0440-р U0441-с U0442-т U0443-у U0443 0301-у́ U0444-ф U0445-х U0446-ц U0447-ч U0448-ш U0449-щ U044A-Ъ U044B-ы U044C-Ь U044D-э U044E-ю U044F-я U0451-ё U0452-ђ U0453-ѓ U0454-є U0455-ѕ U0456-і U0457-ї U0458-ј U0459-љ U045A-њ U045B-ћ U045C-ќ U045E-ў U045F-џ U046A-Ѫ U046B-ѫ U0472-Ѳ U0473-ѳ U0474-Ѵ U0475-ѵ U048C-Ҍ U048D-ҍ U0490-Ґ U0491-ґ U0492-Ғ U0493-ғ U0494-Ҕ U0495-ҕ U0496-Җ U0497-җ U049A-Қ U049B-қ U049E-Ҟ U049F-ҟ U04A2-Ң U04A3-ң U04A4-Ҥ U04A5-ҥ U04A6-Ҧ U04A7-ҧ U04A8-Ҩ U04A9-ҩ U04AA-Ҫ U04AB-ҫ U04AC-Ҭ U04AD-ҭ U04AE-Ү U04AF-ү U04B2-Ҳ U04B3-ҳ U04B4-Ҵ U04B5-ҵ U04BA-Һ U04BB-һ U04BC-Ҽ U04BD-ҽ U04BE-Ҿ U04BF-ҿ U04C0-Ӏ U04C1-Ӂ U04C2-ӂ U04CB-Ӌ U04CC-ӌ U04D0-Ӑ U04D1-ӑ U04D2-Ӓ U04D3-ӓ U04D6-Ӗ U04D7-ӗ U04D8-Ә U04D9-ә U04DC-Ӝ U04DD-ӝ U04DE-Ӟ U04DF-ӟ U04E0-Ӡ U04E1-ӡ U04E4-Ӥ U04E5-ӥ U04E6-Ӧ U04E7-ӧ U04E8-Ө U04E9-ө U04F0-Ӱ U04F1-ӱ U04F2-Ӳ U04F3-ӳ U04F4-Ӵ U04F5-ӵ U04F8-Ӹ U04F9-ӹ U2019-’
Hi, On 2018-10-10 15:19, Egor Kobylkin wrote: > On 10.10.2018 13:22, Marko Myllynen wrote: >>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303 >> >> Although I haven't checked every rule this in general looks very good >> (but see below). > >> Not sure do we want to add the few missing characters >> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode, >> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at >> least initially the more exotic characters, like the historic ones, >> though.) Perhaps filing a bug or two for these cases for separate >> consideration would be ok. > > The question here is what should serve as their transliteration and > transcription? Not sure, so filing a separate bug about this once your patch is merged might be the most suitable action for now, I don't think we want to postpone merging your work further due to these non-ISO 9 cases. >> I'm not sure this will work, no existing rule in translit_* files >> contain two characters, I'd assume that the rule for U+0423 is applied >> first and then the below rule is never used. >> >> % CYRILLIC UNDEFINED >> <U0423><U0301> <U00DA>;"<U0055><U0060>" >> >> Perhaps this should be commented out or removed altogether if it's not >> working as intended. > > So yes, they are not processed. I would drop them to not to have special > cases. But I am also fine with keeping them because all work is done > already. I'd probably drop them but I don't feel strongly about this either way. Thanks for your efforts, I don't have any further comments, I'll leave this now for Rafal and Mike to provide additional feedback and hopefully merge soon. Thanks,
Hi, On 2018-10-09 19:10, Marko Myllynen wrote: > > One thing that might be helpful here could be something like: > > $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE > ž > > That is, force transliteration of each character (if defined) even if > it's part of the target character set. AFAICS this is not currently > possible. FWIW, this is currently not possible with iconv(1) but uconv(1) supports this with -x (AFAICS it's using ICU not glibc locale data): https://en.wikipedia.org/wiki/uconv https://linux.die.net/man/1/uconv https://github.com/unicode-org/icu/tree/master/icu4c/source/extra/uconv Cheers,
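[Editorial sketch] The difference between the usual "transliterate only
what the target character set cannot represent" behaviour and the forced
mode wished for above can be sketched as follows. Everything here is an
assumption made for illustration: the convert function, its force
parameter and the one-entry RULES table are hypothetical, TRANSLIT_FORCE
is a proposed suffix that iconv does not provide, and the single rule is
taken from the fi_FI example in the quoted message.

```python
# Hypothetical one-rule table, from the fi_FI example above: ж -> ž.
RULES = {"ж": "\u017e"}  # LATIN SMALL LETTER Z WITH CARON


def convert(text, target_encoding, force=False):
    """Apply a transliteration rule to a character only when the target
    encoding cannot represent it, or always when force is True."""
    out = []
    for ch in text:
        try:
            ch.encode(target_encoding)
            representable = True
        except UnicodeEncodeError:
            representable = False
        if ch in RULES and (force or not representable):
            out.append(RULES[ch])
        else:
            out.append(ch)
    return "".join(out)
```

With a UTF-8 target every character is representable, so the rule is only
used when force=True; with a target such as ISO-8859-2, which has no
Cyrillic letters, the rule is used either way.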
diff -uNr a/localedata/locales/C b/localedata/locales/C --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000 @@ -2292,6 +2292,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000 +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000 @@ -70,6 +70,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000 +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000 @@ -72,6 +72,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000 +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000 @@ -56,6 +56,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000 +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000 @@ -1396,6 +1396,7 @@ <U137A> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> +include "translit_cyrillic";"" translit_end % END LC_CTYPE diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000 +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000 @@ -44,6 +44,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END 
LC_CTYPE diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000 @@ -69,6 +69,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000 @@ -42,6 +42,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000 @@ -166,6 +166,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000 @@ -86,6 +86,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000 @@ -49,6 +49,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000 @@ -39,6 +39,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/bn_BD 
b/localedata/locales/bn_BD --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000 @@ -63,6 +63,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000 @@ -43,6 +43,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000 @@ -72,6 +72,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000 @@ -39,6 +39,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000 +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000 @@ -2311,6 +2311,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000 @@ -109,6 +109,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB --- a/localedata/locales/cy_GB 
2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000 @@ -69,6 +69,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000 @@ -167,6 +167,7 @@ % LATIN SMALL LETTER O WITH STROKE -> "oe" <U00F8> "<U006F><U0338>";"<U006F><U0065>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000 @@ -78,6 +78,7 @@ % DOUBLE HIGH-REVERSED-9 QUOTATION MARK <U201F> <U00AB>;<U0022> +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000 @@ -52,6 +52,7 @@ include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000 @@ -60,6 +60,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000 @@ -59,6 +59,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB --- a/localedata/locales/en_GB 2018-07-17 
17:49:14.000000000 +0000 +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000 @@ -55,6 +55,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000 +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000 @@ -50,6 +50,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000 +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000 @@ -42,6 +42,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000 +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000 @@ -59,6 +59,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000 +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000 @@ -73,6 +73,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000 +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000 @@ -109,6 +109,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000 +++ b/localedata/locales/fa_IR 2018-07-17 
17:55:49.000000000 +0000 @@ -79,6 +79,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000 +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000 @@ -42,6 +42,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000 +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000 @@ -137,6 +137,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000 @@ -59,6 +59,7 @@ % In France, accents are simply omitted if they cannot be represented. 
include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000 @@ -54,6 +54,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000 @@ -47,6 +47,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000 @@ -62,6 +62,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000 @@ -57,6 +57,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000 @@ -59,6 +59,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000 @@ -61,6 +61,7 @@ translit_start include "translit_combining";"" +include 
"translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000 @@ -37,6 +37,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000 @@ -153,6 +153,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000 @@ -59,6 +59,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000 +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000 @@ -478,6 +478,7 @@ <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>" <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000 @@ -77,6 +77,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000 @@ -55,6 +55,7 @@ translit_start include 
"translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000 @@ -2161,6 +2161,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000 @@ -59,6 +59,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000 @@ -1682,6 +1682,7 @@ include "translit_combining";"" include "translit_cjk_variants";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000 @@ -158,6 +158,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000 @@ -873,6 +873,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000 @@ -63,6 +63,7 @@ translit_start include "translit_combining";"" +include 
"translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000 @@ -6099,6 +6099,7 @@ include "translit_combining";"" include "translit_hangul";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000 @@ -46,6 +46,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000 @@ -58,6 +58,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000 @@ -78,6 +78,7 @@ % LATIN SMALL LETTER E WITH CIRCUMFLEX <U00EA> "<U0065><U005E>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000 @@ -57,6 +57,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000 +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000 @@ -47,6 +47,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" 
translit_end END LC_CTYPE diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000 @@ -39,6 +39,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000 @@ -51,6 +51,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000 @@ -77,6 +77,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000 @@ -2122,6 +2122,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000 @@ -55,6 +55,7 @@ % Accents are simply omitted if they cannot be represented. 
include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000 @@ -59,6 +59,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000 @@ -49,6 +49,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000 @@ -60,6 +60,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE % diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000 @@ -45,6 +45,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000 @@ -47,6 +47,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000 @@ -53,6 +53,7 @@ % accents are simply omitted if they cannot be 
represented. include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000 @@ -154,6 +154,7 @@ % LATIN SMALL LETTER O WITH STROKE -> "oe" <U00F8> "<U006F><U0338>";"<U006F><U0065>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000 @@ -43,6 +43,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000 @@ -60,6 +60,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000 @@ -60,6 +60,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000 @@ -60,6 +60,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000 +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000 @@ -57,6 +57,7 @@ translit_start include 
"translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000 @@ -66,6 +66,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000 @@ -62,6 +62,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000 @@ -140,6 +140,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000 @@ -62,6 +62,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000 @@ -70,6 +70,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000 @@ -60,6 +60,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" 
translit_end END LC_CTYPE diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000 @@ -58,6 +58,7 @@ % Farsi yeh -> yeh <U06CC> "<U064A>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000 @@ -142,6 +142,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000 @@ -59,6 +59,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000 @@ -57,6 +57,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000 @@ -144,6 +144,7 @@ <U0162> "<U021A>";"<U0054>" <U0163> "<U021B>";"<U0074>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000 @@ -74,6 +74,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr 
a/localedata/locales/rw_RW b/localedata/locales/rw_RW --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000 @@ -45,6 +45,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000 @@ -44,6 +44,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000 @@ -46,6 +46,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000 @@ -44,6 +44,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000 @@ -39,6 +39,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000 @@ -205,6 +205,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sgs_LT 
b/localedata/locales/sgs_LT --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000 @@ -59,6 +59,7 @@ copy "i18n" translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000 @@ -45,6 +45,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000 @@ -68,6 +68,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000 +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000 @@ -91,6 +91,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000 @@ -37,6 +37,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000 @@ -70,6 +70,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL --- a/localedata/locales/sq_AL 
2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000 @@ -45,6 +45,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000 @@ -68,6 +68,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000 @@ -64,6 +64,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000 @@ -139,6 +139,7 @@ % LATIN SMALL LETTER O WITH STROKE -> "oe" <U00F8> "<U006F><U0338>";"<U006F><U0065>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000 @@ -44,6 +44,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000 @@ -63,6 +63,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 
+0000 +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000 @@ -63,6 +63,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000 @@ -58,6 +58,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000 @@ -866,6 +866,7 @@ <U137C> <U0060><U0031><U0030><U0030><U0030><U0030> include "translit_combining";"" +include "translit_cyrillic";"" translit_end % END LC_CTYPE diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000 @@ -69,6 +69,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000 @@ -36,6 +36,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000 +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000 @@ -37,6 +37,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000 +++ 
b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN
b/localedata/locales/vi_VN --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000 +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000 @@ -58,6 +58,7 @@ % dong sign -> d// -> dd <U20AB> "<U0111>";"<U0064><U0064>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000 +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000 @@ -69,6 +69,7 @@ <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>" <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000 +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000 @@ -55,6 +55,7 @@ % Accents are simply omitted if they cannot be represented. include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000 +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000 @@ -66,6 +66,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000 +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000 @@ -73,6 +73,7 @@ <U05F0> "<U05D5><U05D5>";"<U0077><U0077>" <U05F1> "<U05D5><U05D9>";"<U0077><U006A>" <U05F2> "<U05D9><U05D9>";"<U006A><U006A>" +include "translit_cyrillic";"" translit_end END LC_CTYPE diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000 +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000 @@ -58,6 +58,7 @@ translit_start 
include "translit_combining";"" +include "translit_cyrillic";"" translit_end class "hanzi"; / diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000 +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000 @@ -70,6 +70,7 @@ translit_start include "translit_combining";"" +include "translit_cyrillic";"" translit_end END LC_CTYPE
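
To make the effect of the translit_cyrillic table easier to check by eye, here is a minimal Python sketch of the same mapping. It is an illustration only and does not call glibc or iconv. The names `SINGLE`, `MULTI`, `translit` and `use_fallback` are hypothetical and not part of the patch. The lowercase entries are copied from the table; the capital entries are derived with `str.upper()`, which happens to match the table for all 33 letters. The sketch does not model how glibc chooses between the alternatives of an entry.

```python
# Letters whose table entry is a single ASCII character (lowercase forms).
SINGLE = dict(zip("абвгдезийклмнопрстуфхь", "abvgdezijklmnoprstufx`"))

# Letters with two alternatives in the table:
# (multi-character first choice, single-character second choice).
MULTI = {
    "ё": ("yo", "y"),
    "ж": ("zh", "z"),
    "ц": ("cz", "c"),
    "ч": ("ch", "c"),
    "ш": ("sh", "s"),
    "щ": ("shh", "s"),
    "ъ": ("``", "`"),
    "ы": ("y'", "y"),
    "э": ("e`", "e"),
    "ю": ("yu", "y"),
    "я": ("ya", "y"),
}


def translit(text, use_fallback=False):
    """Replace Russian Cyrillic letters with the table's ASCII forms.

    use_fallback=False picks the first alternative of each entry,
    use_fallback=True picks the second one. Other characters pass through.
    """
    out = []
    for ch in text:
        low = ch.lower()
        if low in SINGLE:
            rep = SINGLE[low]
        elif low in MULTI:
            rep = MULTI[low][1 if use_fallback else 0]
        else:
            out.append(ch)  # not covered by the table, keep as-is
            continue
        # Capital letters map to the upper-cased form, as in the table.
        out.append(rep.upper() if ch != low else rep)
    return "".join(out)


if __name__ == "__main__":
    print(translit("Привет"))
    print(translit("объём"))
    print(translit("ёж", use_fallback=True))
```

The first alternatives keep distinct letters distinct (for example ш and щ become sh and shh), while the single-character alternatives are shorter but collapse several letters onto the same ASCII character.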