RFC: locale-source validation script
Zack Weinberg <zackw@panix.com> wrote:
> On Wed, Jul 26, 2017 at 8:44 AM, Mike FABIAN <mfabian@redhat.com> wrote:
>> Zack Weinberg <zackw@panix.com> wrote:
>>
>>> - The complaints about "inappropriate character '\t'" are all caused
>>> by _unintentional_ tabs inside strings. If you write
>>>
>>> message "xyz/
>>> abc"
>>>
>>> the whitespace on the second line gets included in the string, which
>>> is not what you want.
>>
>> Yes, at the moment we get for example:
>>
>> $ LC_ALL=et_EE.UTF-8 locale -k postal_fmt
>> postal_fmt="%a%N %f%N %d%N %b%N %s%t%h%t%e%t%r%N %C-%z %T%N %c%N"
>>
>> I’ll fix it like this, this is far more readable as well:
>
> Note that there's probably a bunch of similar cases where the
> undesirable whitespace is just space characters, no tabs - my script
> won't catch that. (I won't be working on it today, but this is on my
> list of things to fix.)
Just as a quick hack to find these cases, I added the following to
your script to flag sequences of 2 or more spaces in strings (diff
below).
This found only 3 instances of space sequences. All of them appeared
to be errors, and I pushed a fix.
@@ -369,6 +369,9 @@
 if c != ' ' and not isgraph(c):
     log.error(fp.lineno, "inappropriate character {!r} in {}",
               c, "string" if end_char == '"' else "symbol")
+if c == ' ' and tbuf[-1:] == [' ']:
+    log.error(fp.lineno, "suspicious sequence of spaces {}", tbuf)
+    tbuf.append(c)
 else:
     tbuf.append(c)
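
To try the same test outside the script, here is a minimal standalone
sketch. It assumes only what the hunk above shows: characters are
collected one at a time into a list, and a space is suspicious when the
previously collected character is also a space. The function name
find_space_runs is made up for illustration and is not part of the
validation script.

```python
def find_space_runs(chars):
    """Return the indices at which a space directly follows another space.

    Hypothetical helper for illustration only; it mirrors the
    `tbuf[-1:] == [' ']` test from the hunk above.
    """
    tbuf = []   # characters collected so far, as in the script
    hits = []   # positions of each space that follows another space
    for i, c in enumerate(chars):
        # tbuf[-1:] is [] for an empty buffer, so this is safe on the
        # first character and never raises IndexError.
        if c == ' ' and tbuf[-1:] == [' ']:
            hits.append(i)
        tbuf.append(c)
    return hits


if __name__ == "__main__":
    for sample in ["Tallinn  10115", "one space only"]:
        print(repr(sample), find_space_runs(sample))
```

A run of n spaces is reported n - 1 times, once for each space after
the first, so a real checker would probably want to report each run
only once.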