revision of 'say' Re: [asterisk-dev] bugs a plenty - discuss....
Luigi Rizzo
rizzo at icir.org
Fri Mar 10 11:30:33 MST 2006
posting here, but this should probably go to -users too, since
people there might be able to help as well.
this is a call to all non-english speakers to
help design the config files for the text-based
"say" implementation.
What we basically need is a set of rules that map
numbers, dates and times into their individual components.
These rules are written in a way that is similar to
dialplan entries - each rule will have on the left
side a pattern to match some numbers/dates/times, and
on the right side a sequence of components that should be
spelled out.
As an example, i am attaching a simple configuration for
english and italian numbers, and enumerations.
If you have a look at the comments, perhaps you can
come up with a description for your language,
and point out exceptions that you don't know how
to represent with this scheme (so that we can find
a solution or enrich the scheme to support your
requirement).
Keep in mind that in the end, in asterisk pronouncing
a number or a date or a time means mapping it into
a sequence of files that must be played out and contain
the components of the number.
Your feedback is required. If someone feels like posting
this to -users, please do so.
Feedback to me or to the list, as you like.
And if you want to try the code that implements this
stuff, it is in my branch team/rizzo/base, with the
configuration file in configs/say.conf.sample
(to be copied into say.conf), and the actions can
be triggered by dialplan lines of the kind
exten => _X.,1,PlayBack(${EXTEN}|say) ; for numbers
exten => _X.,1,PlayBack(date|say) ; for dates
remember, you have to fill say.conf with your patterns.
cheers
luigi
----------------------------------------------------
The configuration for each language is in a section
named with the language in the file say.conf.
Take the case for english numbers, we have the following:
- the section name is
[en]
(for italian it would have been [it] )
- leading zeros are not significant, so we skip them and
pronounce the remaining part. The corresponding rule is
_0. => say:${SAY:1}
where the left side matches a 0 digit followed by 1 or more characters,
and the right side is just a recursive invocation of 'say' with an
argument which is the string (in a variable named SAY) minus the
first character (ordinary asterisk variable syntax)
- single-digit numbers are pronounced as they are, so the rule is
_X => digits/${SAY}
where the pattern on the left matches any single digit (including 0)
and the right side is a filename
- two-digit numbers between 10 and 19 are pronounced as a single word, so
_1X => digits/${SAY}
same as above, the left pattern matches those numbers, the right hand
side maps to a filename
- also multiples of 10 are a single word, so we have a similar rule
_[2-9]0 => digits/${SAY}
- other two-digit numbers are two words, e.g. 83 is 80 followed by 3,
and the rule is the following:
_[2-9][1-9] => digits/${SAY:0:1}0, say:${SAY:1}
here as you see the right hand side has two parts, a file name and
a recursive invocation. I could have written an equivalent rule
_[2-9][1-9] => digits/${SAY:0:1}0, digits/${SAY:1:1}
- three-digit numbers are made of three words, as follows
_XXX => say:${SAY:0:1}, digits/hundred, say:${SAY:1}
or equivalently
_XXX => digits/${SAY:0:1}, digits/hundred, say:${SAY:1}
Note that in writing this rule we rely on the fact that asterisk
has a 'shortest pattern match' algorithm, i.e. a number such as
053 would also match the _0. pattern, which is shorter and thus gets selected.
If we don't rely on that, we should write the patterns as
_0XX => say:${SAY:1}
_[1-9]XX => say:${SAY:0:1}, digits/hundred, say:${SAY:1}
- and so on for thousands and millions...
_XXXX => say:${SAY:0:1}, digits/thousand, say:${SAY:1}
_XXXXX => say:${SAY:0:2}, digits/thousand, say:${SAY:2}
_XXXXXX => say:${SAY:0:3}, digits/thousand, say:${SAY:3}
_XXXXXXX => say:${SAY:0:1}, digits/million, say:${SAY:1}
_XXXXXXXX => say:${SAY:0:2}, digits/million, say:${SAY:2}
_XXXXXXXXX => say:${SAY:0:3}, digits/million, say:${SAY:3}
_XXXXXXXXXX => say:${SAY:0:1}, digits/billion, say:${SAY:1}
_XXXXXXXXXXX => say:${SAY:0:2}, digits/billion, say:${SAY:2}
_XXXXXXXXXXXX => say:${SAY:0:3}, digits/billion, say:${SAY:3}
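To make the number rules above concrete, here is a small Python sketch
(not the code in team/rizzo/base) that expands a digit string into the
list of files to play. It works under a few assumptions: rules are tried
in the order listed and the first match wins, which only stands in for the
'shortest pattern match' behaviour mentioned above; patterns may use only
X, [..] and a trailing '.'; and only ${SAY}, ${SAY:offset} and
${SAY:offset:length} with non-negative values are handled. The names
pattern_to_regex, substitute and say are made up for this example.

```python
import re

# A subset of the [en] rules from above. In this sketch the first
# matching rule wins, so _0. is listed before the fixed-length patterns.
RULES = [
    ("_0.", "say:${SAY:1}"),
    ("_X", "digits/${SAY}"),
    ("_1X", "digits/${SAY}"),
    ("_[2-9]0", "digits/${SAY}"),
    ("_[2-9][1-9]", "digits/${SAY:0:1}0, say:${SAY:1}"),
    ("_XXX", "say:${SAY:0:1}, digits/hundred, say:${SAY:1}"),
    ("_XXXX", "say:${SAY:0:1}, digits/thousand, say:${SAY:1}"),
]

def pattern_to_regex(pattern):
    """Translate a dialplan-style pattern (X, [..], '.') into a regex."""
    body = pattern[1:]  # drop the leading underscore
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "X":
            out.append("[0-9]")
        elif ch == ".":
            out.append(".+")  # one or more characters
        elif ch == "[":
            end = body.index("]", i)
            out.append(body[i:end + 1])  # character class copied as is
            i = end
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out))

VAR = re.compile(r"\$\{SAY(?::(\d+))?(?::(\d+))?\}")

def substitute(template, value):
    """Replace ${SAY}, ${SAY:off} and ${SAY:off:len} with parts of value."""
    def repl(match):
        start = int(match.group(1) or 0)
        if match.group(2) is None:
            return value[start:]
        return value[start:start + int(match.group(2))]
    return VAR.sub(repl, template)

def say(value):
    """Return the list of sound files for a digit string."""
    for pattern, rhs in RULES:
        if pattern_to_regex(pattern).fullmatch(value):
            files = []
            for part in substitute(rhs, value).split(","):
                part = part.strip()
                if part.startswith("say:"):
                    files.extend(say(part[4:]))  # recursive invocation
                else:
                    files.append(part)  # plain file name
            return files
    raise ValueError("no rule matches %r" % value)
```

With these rules, say("83") gives digits/80 followed by digits/3, and
say("053") skips the leading zero and gives digits/50 followed by digits/3.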
Enumerations are identified by a special prefix string 'enum' but other
than that the same reasoning applies: we select the pattern on the
left, and play (directly or recursively) its components, which can
be plain numbers or enumerations. So the rules start with
; enumeration
; single digit
_enum:X => digits/h-${SAY}
; eleventh..nineteenth
_enum:1X => digits/h-${SAY}
; twentieth, thirtieth...
_enum:[2-9]0 => digits/h-${SAY}
; twenty first, twenty second... ninety ninth
_enum:[2-9][1-9] => say:${SAY:0:1}0, digits/h-${SAY:1}
; X hundred twenty fifth ...
_enum:[1-9]XX => say:${SAY:0:1}, digits/hundred, say:enum:${SAY:1}
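As a rough illustration of the enumeration rules just listed, the Python
sketch below maps a digit string of up to three digits to its file names.
It hard-codes those five rules instead of parsing say.conf, and it assumes
that SAY holds only the part after the 'enum:' prefix, and that say:N0 and
say:N resolve to digits/N0 and digits/N as in the number rules. The
function name ordinal is invented for this example.

```python
def ordinal(s):
    """Files for an ordinal given as a digit string, per the rules above."""
    if len(s) == 1:                      # _enum:X
        return ["digits/h-" + s]
    if len(s) == 2 and s[0] == "1":      # _enum:1X
        return ["digits/h-" + s]
    if len(s) == 2 and s[0] in "23456789":
        if s[1] == "0":                  # _enum:[2-9]0
            return ["digits/h-" + s]
        # _enum:[2-9][1-9]: cardinal tens, then ordinal unit
        return ["digits/" + s[0] + "0", "digits/h-" + s[1]]
    if len(s) == 3 and s[0] != "0":      # _enum:[1-9]XX
        # cardinal hundreds, then the rest as an enumeration again
        return ["digits/" + s[0], "digits/hundred"] + ordinal(s[1:])
    raise ValueError("no enum rule shown above matches %r" % s)
```

For example, ordinal("125") yields digits/1, digits/hundred, digits/20,
digits/h-5. A value such as "105" raises an error here, because none of
the rules shown so far covers an enumeration with a leading zero.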
For dates and times, we have special prefixes.
say_date or say_time translate into the pattern 'date' or 'time',
and the components (day, day of week, minutes...) on the right hand
side are identified by %x, where x is a character from strftime
(e.g. %Y means 'the full year number'). Each component then matches
rules with the prefix _x: (e.g. _Y:. for the year), and at this point
the variable SAY contains a number which can be used to build a
file name or pronounce a number. As an example, the syntax for a date
in italian order (day, month, year) is
; any date is pronounced as day month year
_date => say:%d, say:%m, say:%Y
; any day irrespective of the value is pronounced as a number
_d:. => say:${SAY}
; in fact if we are picky, the first of the month is an ordinal
; so the rule should be
_d:1 => digits/h-1
_d:[2-9] => digits/${SAY}
_d:[1-3][0-9] => digits/${SAY:0:1}, digits/${SAY:1:1}
; then the month is just the month's name
_m:. => digits/mon-${SAY}
; and the year is just a number
_Y:. => say:${SAY}
; whereas in english it would be something like
_Y:1[1-9]XX => say:${SAY:0:2}, digits/hundred, say:${SAY:2:2}
_Y:20XX => say:2000, say:${SAY:2:2}
etc.
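The first level of the date expansion can be sketched in Python as
follows. This is only an illustration: it uses the simple day rule
(_d:. => say:${SAY}), leaves the nested say: steps unexpanded, and assumes
that each component reaches its rule as a plain decimal number without
zero padding, which may not be how the code in the branch passes the
values. The function name expand_date and its english_year flag are made
up for the example.

```python
import datetime

def expand_date(when, english_year=False):
    """First-level expansion of '_date => say:%d, say:%m, say:%Y'."""
    day, month, year = str(when.day), str(when.month), str(when.year)
    # _d:. => say:${SAY}   and   _m:. => digits/mon-${SAY}
    steps = ["say:" + day, "digits/mon-" + month]
    if english_year and len(year) == 4 and year[0] == "1" and year[1] in "123456789":
        # _Y:1[1-9]XX => say:${SAY:0:2}, digits/hundred, say:${SAY:2:2}
        steps += ["say:" + year[0:2], "digits/hundred", "say:" + year[2:4]]
    elif english_year and len(year) == 4 and year.startswith("20"):
        # _Y:20XX => say:2000, say:${SAY:2:2}
        steps += ["say:2000", "say:" + year[2:4]]
    else:
        # _Y:. => say:${SAY}
        steps.append("say:" + year)
    return steps
```

For 10 March 2006 this gives say:10, digits/mon-3, say:2006 with the
italian year rule, and say:10, digits/mon-3, say:2000, say:06 with the
english one.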
Hope you get the idea.