From patchwork Fri Feb 25 12:35:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carlos O'Donell X-Patchwork-Id: 51395 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5B42B3857C59 for ; Fri, 25 Feb 2022 12:37:04 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5B42B3857C59 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1645792624; bh=Eqq2qWYvRqdnR+EEIgdxqqbLP+zwMg3jQBZqd6U45MQ=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=WotSAZA8JCr2z0kXeNTwBRrqa/awlDrNawhVMavzo8QYmjlVzEfej6HELgkO0Azcv 78NVZEBdMgEg9/pxXrkSPe7CdwiVUTP/r7Zj8uY6/BQos5xv0jYIjkeYocnQCEk1bJ mOTHXoWZaTQE0Clcg2bK08jFO4Yyx9hppgoXqX4U= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 51E483858432 for ; Fri, 25 Feb 2022 12:36:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 51E483858432 Received: from mail-il1-f197.google.com (mail-il1-f197.google.com [209.85.166.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-670-PX8X7h8WMrG0n-UY5Mp3_Q-1; Fri, 25 Feb 2022 07:35:59 -0500 X-MC-Unique: PX8X7h8WMrG0n-UY5Mp3_Q-1 Received: by mail-il1-f197.google.com with SMTP id 11-20020a056e0220cb00b002c299cb036cso2689652ilq.3 for ; Fri, 25 Feb 2022 04:35:59 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Eqq2qWYvRqdnR+EEIgdxqqbLP+zwMg3jQBZqd6U45MQ=; b=1UhbvMd+iyUSQF7ePoX/qOW6hMJsoixmKLr3mRfkH0AMlFVQs8+KO2HvFvngAx6Po7 5jttBSeenM1CZJtLa+taXdSFszodifN/3hGqj39NpBVgqTpSxTVd6ggFaQ+VsCGs6mnx t3w7xdqfoWmPvzWt40h26gnkE6Rv+BLPjGTbmL6ACajxdgrzOMDX4upBfbcN6J+bqZ1c /ezYAI4EUJaoAbwlyP+FaJRtLKc4AHu91Esa2Hm7Ic1AlrAMBHr4i5p370ieOttQMnIe dFudegVrWfH1DJv0fdaurfyt17pJvBeek8IWLpb2NiGFN3yfbSO0EszqfZv5sAHrSlff vxrA== X-Gm-Message-State: AOAM531jMWyTxR9xHtHkIQA8zJjUZGFfP+L4o7bpct+j/xr4rn0q12Jx WA6ilwmiJGyLRhWHHBA2tMLD80i0q358TOJZtpiHRMEP8NaoiGyEVuuwJQF60dhx+uDqz9ogO/T 0ekmoJGb7uXRCGOAb1cuxMOErpwYgVeNhiGp5lLc2bcWEnvVyXIdss8erMPw+C/xbvlMsBA== X-Received: by 2002:a6b:f417:0:b0:5ed:2a68:c19b with SMTP id i23-20020a6bf417000000b005ed2a68c19bmr5364396iog.22.1645792558695; Fri, 25 Feb 2022 04:35:58 -0800 (PST) X-Google-Smtp-Source: ABdhPJwvfY3z0YHwtZ5g/9Uwoj8Mu1Ry9+M1wDftTcuISLfJfEM3laoFxHZ4INqbUakJ+WyuNy2KXw== X-Received: by 2002:a6b:f417:0:b0:5ed:2a68:c19b with SMTP id i23-20020a6bf417000000b005ed2a68c19bmr5364377iog.22.1645792558177; Fri, 25 Feb 2022 04:35:58 -0800 (PST) Received: from athas.redhat.com (135-23-175-80.cpe.pppoe.ca. [135.23.175.80]) by smtp.gmail.com with ESMTPSA id a2-20020a056e02120200b002c21a18437csm1535994ilq.40.2022.02.25.04.35.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 25 Feb 2022 04:35:57 -0800 (PST) To: libc-alpha@sourceware.org Subject: [COMMITTED v3 1/2] localedef: Update LC_MONETARY handling (Bug 28845) Date: Fri, 25 Feb 2022 07:35:53 -0500 Message-Id: <20220225123554.964847-2-carlos@redhat.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220225123554.964847-1-carlos@redhat.com> References: <20220225123554.964847-1-carlos@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Carlos O'Donell via Libc-alpha From: Carlos O'Donell Reply-To: Carlos O'Donell Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" ISO C17, POSIX Issue 7, and ISO 30112 all allow the char* types to be empty strings i.e. "", integer or char values to be -1 or CHAR_MAX respectively, with the exception of decimal_point which must be non-empty in ISO C. Note that the defaults for mon_grouping vary, but are functionaly equivalent e.g. "\177" (no further grouping reuqired) vs. "" (no grouping defined for all groups). We include a broad comment talking about harmonizing ISO C, POSIX, ISO 30112, and the default C/POSIX locale for glibc. We reorder all setting based on locale/categories.def order. We soften all missing definitions from errors to warnings when defaults exist. Given that ISO C, POSIX and ISO 30112 allow the empty string we change LC_MONETARY handling of mon_decimal_point to allow the empty string. If mon_decimal_point is not defined at all then we pick the existing legacy glibc default value of i.e. ".". We also set the default for mon_thousands_sep_wc at the same time as mon_thousands_sep, but this is not a change in behaviour, it is always either a matching value or L'\0', but if in the future we change the default to a non-empty string we would need to update both at the same time. Tested on x86_64 and i686 without regressions. Tested with install-locale-archive target. Tested with install-locale-files target. Reviewed-by: DJ Delorie --- locale/programs/ld-monetary.c | 182 +++++++++++++++++++++++++++------- 1 file changed, 146 insertions(+), 36 deletions(-) diff --git a/locale/programs/ld-monetary.c b/locale/programs/ld-monetary.c index 3b0412b405..18698bbe94 100644 --- a/locale/programs/ld-monetary.c +++ b/locale/programs/ld-monetary.c @@ -196,21 +196,105 @@ No definition for %s category found"), "LC_MONETARY"); } } + /* Generally speaking there are 3 standards the define the default, + warning, and error behaviour of LC_MONETARY. They are ISO/IEC TR 30112, + ISO/IEC 9899:2018 (ISO C17), and POSIX.1-2017. Within 30112 we have the + definition of a standard i18n FDCC-set, which for LC_MONETARY has the + following default values: + int_curr_symbol "" + currency_symbol "" + mon_decimal_point "" i.e. "," + mon_thousand_sep "" + mon_grouping "\177" i.e. CHAR_MAX + positive_sign "" + negative_sign "" i.e. "." + int_frac_digits -1 + frac_digits -1 + p_cs_precedes -1 + p_sep_by_space -1 + n_cs_precedes -1 + n_sep_by_space -1 + p_sign_posn -1 + n_sign_posn -1 + Under 30112 a keyword that is not provided implies an empty string "" + for string values or a -1 for integer values, and indicates the value + is unspecified with no default implied. No errors are considered. + The exception is mon_grouping which is a string with a terminating + CHAR_MAX. + For POSIX Issue 7 we have: + https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html + and again values not provided default to "" or -1, and indicate the value + is not available to the locale. The exception is mon_grouping which is + a string with a terminating CHAR_MAX. For the POSIX locale the values of + LC_MONETARY should be: + int_curr_symbol "" + currency_symbol "" + mon_decimal_point "" + mon_thousands_sep "" + mon_grouping "\177" i.e. CHAR_MAX + positive_sign "" + negative_sign "" + int_frac_digits -1 + frac_digits -1 + p_cs_precedes -1 + p_sep_by_space -1 + n_cs_precedes -1 + n_sep_by_space -1 + p_sign_posn -1 + n_sign_posn -1 + int_p_cs_precedes -1 + int_p_sep_by_space -1 + int_n_cs_precedes -1 + int_n_sep_by_space -1 + int_p_sign_posn -1 + int_n_sign_posn -1 + Like with 30112, POSIX also considers no error if the keywords are + missing, only that if the cateory as a whole is missing the referencing + of the category results in unspecified behaviour. + For ISO C17 there is no default value provided, but the localeconv + specification in 7.11.2.1 admits that members of char * type may point + to "" to indicate a value is not available or is of length zero. + The exception is decimal_point (not mon_decimal_point) which must be a + defined non-empty string. The values of char, which are generally + mapped to integer values in 30112 and POSIX, must be non-negative + numbers that map to CHAR_MAX when a value is not available in the + locale. + In ISO C17 for the "C" locale all values are empty strings "", or + CHAR_MAX, with the exception of decimal_point which is "." (defined + in LC_NUMERIC). ISO C17 makes no exception for mon_grouping like + 30112 and POSIX, but a value of "" is functionally equivalent to + "\177" since neither defines a grouping (though the latter terminates + the grouping). + + Lastly, we must consider the legacy C/POSIX locale that implemented + as a builtin in glibc and wether a default value mapping to the + C/POSIX locale may benefit the user from a compatibility perspective. + + Thus given 30112, POSIX, ISO C, and the builtin C/POSIX locale we + need to pick appropriate defaults below. */ + + /* The members of LC_MONETARY are handled in the order of their definition + in locale/categories.def. Please keep them in that order. */ + + /* The purpose of TEST_ELEM is to define a default value for the fields + in the category if the field was not defined in the cateory. If the + category was present but we didn't see a definition for the field then + we also issue a warning, otherwise the only warning you get is the one + earlier when a default category is created (completely missing category). + This missing field warning is glibc-specific since no standard requires + this warning, but we consider it valuable to print a warning for all + missing fields in the category. */ #define TEST_ELEM(cat, initval) \ if (monetary->cat == NULL) \ { \ if (! nothing) \ - record_error (0, 0, _("%s: field `%s' not defined"), \ - "LC_MONETARY", #cat); \ + record_warning (_("%s: field `%s' not defined"), \ + "LC_MONETARY", #cat); \ monetary->cat = initval; \ } + /* Keyword: int_curr_symbol. */ TEST_ELEM (int_curr_symbol, ""); - TEST_ELEM (currency_symbol, ""); - TEST_ELEM (mon_thousands_sep, ""); - TEST_ELEM (positive_sign, ""); - TEST_ELEM (negative_sign, ""); - /* The international currency symbol must come from ISO 4217. */ if (monetary->int_curr_symbol != NULL) { @@ -247,41 +331,63 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"), } } - /* The decimal point must not be empty. This is not said explicitly - in POSIX but ANSI C (ISO/IEC 9899) says in 4.4.2.1 it has to be - != "". */ + /* Keyword: currency_symbol */ + TEST_ELEM (currency_symbol, ""); + + /* Keyword: mon_decimal_point */ + /* ISO C17 7.11.2.1.3 explicitly allows mon_decimal_point to be the + empty string e.g. "". This indicates the value is not available in the + current locale or is of zero length. However, if the value was never + defined then we issue a warning and use a glibc-specific default. ISO + 30112 in the i18n FDCC-Set uses ",", and POSIX Issue 7 in the + POSIX locale uses "". It is specific to glibc that the default is + "."; we retain this existing behaviour for backwards compatibility. */ if (monetary->mon_decimal_point == NULL) { if (! nothing) - record_error (0, 0, _("%s: field `%s' not defined"), - "LC_MONETARY", "mon_decimal_point"); + record_warning (_("%s: field `%s' not defined, using defaults"), + "LC_MONETARY", "mon_decimal_point"); monetary->mon_decimal_point = "."; monetary->mon_decimal_point_wc = L'.'; } - else if (monetary->mon_decimal_point[0] == '\0' && ! be_quiet && ! nothing) + + /* Keyword: mon_thousands_sep */ + if (monetary->mon_thousands_sep == NULL) { - record_error (0, 0, _("\ -%s: value for field `%s' must not be an empty string"), - "LC_MONETARY", "mon_decimal_point"); + if (! nothing) + record_warning (_("%s: field `%s' not defined, using defaults"), + "LC_MONETARY", "mon_thousands_sep"); + monetary->mon_thousands_sep = ""; + monetary->mon_thousands_sep_wc = L'\0'; } + /* Keyword: mon_grouping */ if (monetary->mon_grouping_len == 0) { if (! nothing) - record_error (0, 0, _("%s: field `%s' not defined"), - "LC_MONETARY", "mon_grouping"); - + record_warning (_("%s: field `%s' not defined"), + "LC_MONETARY", "mon_grouping"); + /* Missing entries are given 1 element in their bytearray with + a value of CHAR_MAX which indicates that "No further grouping + is to be performed" (functionally equivalent to ISO C's "C" + locale default of ""). */ monetary->mon_grouping = (char *) "\177"; monetary->mon_grouping_len = 1; } + /* Keyword: positive_sign */ + TEST_ELEM (positive_sign, ""); + + /* Keyword: negative_sign */ + TEST_ELEM (negative_sign, ""); + #undef TEST_ELEM #define TEST_ELEM(cat, min, max, initval) \ if (monetary->cat == -2) \ { \ if (! nothing) \ - record_error (0, 0, _("%s: field `%s' not defined"), \ - "LC_MONETARY", #cat); \ + record_warning (_("%s: field `%s' not defined"), \ + "LC_MONETARY", #cat); \ monetary->cat = initval; \ } \ else if ((monetary->cat < min || monetary->cat > max) \ @@ -300,16 +406,11 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"), TEST_ELEM (p_sign_posn, -1, 4, -1); TEST_ELEM (n_sign_posn, -1, 4, -1); - /* The non-POSIX.2 extensions are optional. */ - if (monetary->duo_int_curr_symbol == NULL) - monetary->duo_int_curr_symbol = monetary->int_curr_symbol; - if (monetary->duo_currency_symbol == NULL) - monetary->duo_currency_symbol = monetary->currency_symbol; - - if (monetary->duo_int_frac_digits == -2) - monetary->duo_int_frac_digits = monetary->int_frac_digits; - if (monetary->duo_frac_digits == -2) - monetary->duo_frac_digits = monetary->frac_digits; + /* Keyword: crncystr */ + monetary->crncystr = (char *) xmalloc (strlen (monetary->currency_symbol) + + 2); + monetary->crncystr[0] = monetary->p_cs_precedes ? '-' : '+'; + strcpy (&monetary->crncystr[1], monetary->currency_symbol); #undef TEST_ELEM #define TEST_ELEM(cat, alt, min, max) \ @@ -327,6 +428,17 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"), TEST_ELEM (int_p_sign_posn, p_sign_posn, -1, 4); TEST_ELEM (int_n_sign_posn, n_sign_posn, -1, 4); + /* The non-POSIX.2 extensions are optional. */ + if (monetary->duo_int_curr_symbol == NULL) + monetary->duo_int_curr_symbol = monetary->int_curr_symbol; + if (monetary->duo_currency_symbol == NULL) + monetary->duo_currency_symbol = monetary->currency_symbol; + + if (monetary->duo_int_frac_digits == -2) + monetary->duo_int_frac_digits = monetary->int_frac_digits; + if (monetary->duo_frac_digits == -2) + monetary->duo_frac_digits = monetary->frac_digits; + TEST_ELEM (duo_p_cs_precedes, p_cs_precedes, -1, 1); TEST_ELEM (duo_p_sep_by_space, p_sep_by_space, -1, 2); TEST_ELEM (duo_n_cs_precedes, n_cs_precedes, -1, 1); @@ -349,17 +461,15 @@ not correspond to a valid name in ISO 4217 [--no-warnings=intcurrsym]"), if (monetary->duo_valid_to == 0) monetary->duo_valid_to = 99991231; + /* Keyword: conversion_rate */ if (monetary->conversion_rate[0] == 0) { monetary->conversion_rate[0] = 1; monetary->conversion_rate[1] = 1; } - /* Create the crncystr entry. */ - monetary->crncystr = (char *) xmalloc (strlen (monetary->currency_symbol) - + 2); - monetary->crncystr[0] = monetary->p_cs_precedes ? '-' : '+'; - strcpy (&monetary->crncystr[1], monetary->currency_symbol); + /* A value for monetary-decimal-point-wc was set when + monetary_decimal_point was set, likewise for monetary-thousands-sep-wc. */ }