GSoC 2010 – week summary: May 31st – June 6th
After implementing the CLDR reader last week, I focused on the classes that will use data from the Common Locale Data Repository (CLDR).
I started with PluralsReader, a class which will be used when translating messages (labels). For a given locale and number, it returns the name of a plural form (a string, one of: zero, one, two, few, many, other). In some languages the form of a noun depends on the number it is used with, for example with units of time or currency. English has only two forms (“one” and “other”), for example:
form "one": 1 day
form "other": 0 days, 2 days, 5 days, ...
A rule used here is simple, and can be written like this:
<pluralRule count="one">n is 1</pluralRule>
(this is actually how rules are defined in CLDR)
For other languages, rules can be very complicated:
<pluralRules locales="hr ru sr uk be bs sh">
<pluralRule count="one">n mod 10 is 1 and n mod 100 is not 11</pluralRule>
<pluralRule count="few">n mod 10 in 2..4 and n mod 100 not in 12..14</pluralRule>
<pluralRule count="many">n mod 10 is 0 or n mod 10 in 5..9 or n mod 100 in 11..14</pluralRule>
<!-- rest are plurals -->
</pluralRules>
The PluralsReader class can parse these rules and determine which form should be used for a particular number. I committed this class (along with some related ones) in Revision 4399.
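To make the rule syntax above more concrete, here is a minimal Python sketch of how such rules could be evaluated. It is not the PluralsReader code: the function names (`rule_matches`, `plural_form`) are made up for illustration, it handles whole numbers only, and it supports just the constructs visible in the rules quoted above (`is`, `is not`, `in`, `not in`, `mod`, `and`, `or`).

```python
import re

# One relation such as "n mod 10 in 2..4" or "n is not 11".
_RELATION = re.compile(
    r"^n(?:\s+mod\s+(\d+))?\s+(is not|is|not in|in)\s+(\d+)(?:\.\.(\d+))?$"
)


def _relation_holds(relation, n):
    match = _RELATION.match(relation.strip())
    if match is None:
        raise ValueError(f"unsupported relation: {relation!r}")
    modulus, operator, low, high = match.groups()
    value = n % int(modulus) if modulus else n
    low = int(low)
    high = int(high) if high is not None else low  # "is 1" is the range 1..1
    inside = low <= value <= high
    return inside if operator in ("is", "in") else not inside


def rule_matches(rule, n):
    # "or" binds more loosely than "and", so split on "or" first.
    return any(
        all(_relation_holds(part, n) for part in re.split(r"\s+and\s+", alternative))
        for alternative in re.split(r"\s+or\s+", rule)
    )


def plural_form(rules, n):
    # Rules are tried in order; "other" is the fallback when none matches.
    for form, rule in rules:
        if rule_matches(rule, n):
            return form
    return "other"


ENGLISH = [("one", "n is 1")]
RUSSIAN = [
    ("one", "n mod 10 is 1 and n mod 100 is not 11"),
    ("few", "n mod 10 in 2..4 and n mod 100 not in 12..14"),
    ("many", "n mod 10 is 0 or n mod 10 in 5..9 or n mod 100 in 11..14"),
]

for n in (1, 2, 5, 11, 21, 22):
    print(n, plural_form(RUSSIAN, n))
```

With the Russian rules this prints one, few, many, many, one, few for the six numbers, which shows why 11 and 21 end up in different forms even though both end in 1.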
Once the plurals reader was done, I started on another, similar class: NumbersReader. This one is more complicated. It can format a number (float or integer) using a format string. The syntax of format strings (patterns) is defined in CLDR; it is very flexible and fairly complicated.
I didn’t implement all the features described in CLDR (or, to be precise, in Unicode Technical Standard #35), although I wanted NumbersReader to support as much of the syntax as possible. I came across a solution to a similar problem in the Yii Framework codebase, which was a very good starting point for me.
These are examples of formats supported by NumbersReader (one format per line):
#,##0.###
##0%
#,##0.00
00000.0000
'#,##0.0;(#)
¤ #,##0.00;¤ #,##0.00-
#,##0.05
The NumbersReader class parses a format, stores the parsed representation in a cache, and then uses it to format numbers. It has methods for formatting decimal, percent and currency numbers (these formats are extracted from CLDR). Formatting with a custom format is also possible.
NumbersReader was committed in Revision 4445.
Next week I will work on a similar Reader class for dates and times. Hopefully I will also have time to do something about currency formatting. For now, NumbersReader can format a value with a currency sign, but in a simplified way (it just replaces the currency placeholder with the currency sign provided). CLDR has pretty extensive data for currencies.
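The parse-once, format-many idea can be sketched in a few lines of Python. This is only an illustration under several assumptions, not the NumbersReader implementation: `ParsedPattern`, `parse_pattern` and `format_number` are hypothetical names, the separators are passed in by hand instead of being read from CLDR, half-even rounding is assumed, and only a small subset of the pattern syntax is handled (digit placeholders, one grouping size, a trailing `%`). Negative subpatterns, the currency placeholder, quoting and rounding increments such as `#,##0.05` are left out.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from functools import lru_cache
from typing import NamedTuple


class ParsedPattern(NamedTuple):
    min_int: int     # minimum number of integer digits ("0" placeholders)
    group_size: int  # digits per group, 0 means no grouping
    min_frac: int    # minimum number of fraction digits
    max_frac: int    # maximum number of fraction digits
    percent: bool    # pattern ends with "%"


@lru_cache(maxsize=None)  # each pattern string is parsed only once
def parse_pattern(pattern: str) -> ParsedPattern:
    percent = pattern.endswith("%")
    body = pattern[:-1] if percent else pattern
    integer_part, _, fraction_part = body.partition(".")
    group_size = 0
    if "," in integer_part:
        # Group size = number of digit slots after the last comma.
        group_size = len(integer_part) - integer_part.rfind(",") - 1
    return ParsedPattern(
        min_int=integer_part.count("0"),
        group_size=group_size,
        min_frac=fraction_part.count("0"),
        max_frac=len(fraction_part),
        percent=percent,
    )


def format_number(value, pattern, decimal_sep=".", group_sep=","):
    parsed = parse_pattern(pattern)
    number = Decimal(str(value))
    if parsed.percent:
        number *= 100
    # Round to the maximum number of fraction digits the pattern allows.
    number = number.quantize(Decimal(10) ** -parsed.max_frac, rounding=ROUND_HALF_EVEN)
    sign = "-" if number < 0 else ""
    digits, _, fraction = f"{abs(number):f}".partition(".")
    # Drop optional trailing zeros, pad the integer part with leading zeros.
    while len(fraction) > parsed.min_frac and fraction.endswith("0"):
        fraction = fraction[:-1]
    digits = digits.zfill(parsed.min_int)
    if parsed.group_size:
        groups = []
        while len(digits) > parsed.group_size:
            groups.insert(0, digits[-parsed.group_size:])
            digits = digits[:-parsed.group_size]
        groups.insert(0, digits)
        digits = group_sep.join(groups)
    result = digits + (decimal_sep + fraction if fraction else "")
    return sign + result + ("%" if parsed.percent else "")


print(format_number(1234567.891, "#,##0.###"))
print(format_number(1234.5, "#,##0.00", decimal_sep=",", group_sep="."))
print(format_number(0.256, "##0%"))
```

Keeping the parsed pattern separate from the formatting step is what makes the cache useful: the string analysis happens once per pattern, and every later call only does the arithmetic and string assembly.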


