[asterisk-dev] [design] Matching algorithm
JR Richardson
jmr.richardson at gmail.com
Tue Jun 3 16:07:08 CDT 2008
> There is a bug that has been filed, number 12777, which describes a possible
> bug in the current matching algorithm. I say possible, because it could also
> be seen as a feature.
>
> Historically, in 1.2 and previous versions, the pattern matching was done
> based upon ASCII sorting. That is, "_1" sorted before "_X", so "_1" matched
> before "_X". This broke down from what people expected when you take into
> account class-based matching, such as "_[1-4]". Since "X" occurs earlier in
> the ASCII table than "[", "_X" matched before "_[1-4]".
>
> This was changed in 1.4, such that each class matched as if it were a single
> entity and more specific matches matched first. So the last example reversed
> the sorting priority order, "_[1-4]" now matches before "_X".
>
> In the matching code, however, there is an anomaly, in that if you use the
> lowercase versions of the common classes, that is "n", "x", and "z", they
> currently match earlier than more specific matches. That is, "_x" matches
> before "_[1-4]", which matches before "_X".
>
> Now, we are not going to change this behavior in 1.4, certainly. That has the
> potential to break currently working dialplans, and where we can reasonably
> foresee such an outcome, we'd like to avoid that.
>
> However, this is certainly an unintended behavior, and the question then
> becomes, do we document this as a way to override the pattern match
> algorithm, or do we change the lowercase class letters to behave the same
> as the uppercase class letters?
>
> I open the floor to more discussion.
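
As a rough illustration of the two orderings described in the quoted text above, here is a small Python sketch. It is not the Asterisk code. The helper names (tokenize, class_size, specificity_key) are hypothetical, and I am assuming "more specific" means "matches fewer characters", compared position by position. It only knows the uppercase classes and bracket sets; lowercase letters are treated as ordinary characters here.

```python
# Sketch only: contrasts plain ASCII ordering with a "most specific first"
# ordering. Not Asterisk's implementation; all names are hypothetical.

CLASS_SIZES = {"X": 10, "Z": 9, "N": 8}  # how many digits each class matches


def tokenize(pattern):
    """Split '_[1-4]X' into ['[1-4]', 'X'], dropping the leading underscore."""
    body = pattern[1:]
    tokens = []
    i = 0
    while i < len(body):
        if body[i] == "[":
            end = body.index("]", i)
            tokens.append(body[i:end + 1])
            i = end + 1
        else:
            tokens.append(body[i])
            i += 1
    return tokens


def class_size(token):
    """Number of characters one pattern element can match (assumed metric)."""
    if token.startswith("["):
        body = token[1:-1]
        count = 0
        i = 0
        while i < len(body):
            if i + 2 < len(body) and body[i + 1] == "-":
                # A range such as 1-4 counts every character in the range.
                count += ord(body[i + 2]) - ord(body[i]) + 1
                i += 3
            else:
                count += 1
                i += 1
        return count
    return CLASS_SIZES.get(token, 1)  # anything else is a single literal


def specificity_key(pattern):
    """Sort key: smaller sets (more specific elements) sort first."""
    return [class_size(t) for t in tokenize(pattern)]


patterns = ["_X", "_[1-4]", "_1"]
ascii_order = sorted(patterns)                          # 1.2-style ordering
specific_order = sorted(patterns, key=specificity_key)  # 1.4-style ordering
```

With these three patterns, ascii_order puts "_X" ahead of "_[1-4]" because "X" comes before "[" in ASCII, while specific_order puts "_[1-4]" ahead of "_X".
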
Documentation and wiki:
Extension names are not limited to single specific extension
"numbers". A single extension can also match patterns. In the
extensions.conf file, an extension name is a pattern if it starts with
the underscore symbol (_). In an extension pattern, the following
characters have special meanings:
Special Characters for Pattern Matching

  X         matches any digit from 0-9
  Z         matches any digit from 1-9
  N         matches any digit from 2-9
  [1237-9]  matches any digit or letter in the brackets
            (in this example, 1,2,3,7,8,9)
  .         wildcard, matches one or more characters
  !         wildcard, matches zero or more characters immediately
            (only Asterisk 1.2 and later, see note)
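
To make the documented characters concrete, here is a minimal Python sketch of a matcher that follows only what the documentation excerpt above says. It is an illustration under my own assumptions, not how Asterisk implements matching. The names pattern_matches and expand_bracket are hypothetical, and lowercase letters are treated as literals because the documentation lists only the uppercase forms.

```python
# Sketch only: a simplified matcher for the documented special characters.
# Not Asterisk's implementation; function names are hypothetical.

CLASSES = {
    "X": set("0123456789"),
    "Z": set("123456789"),
    "N": set("23456789"),
}


def expand_bracket(body):
    """Expand a bracket body such as '1237-9' into a set of characters."""
    chars = set()
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            lo, hi = ord(body[i]), ord(body[i + 2])
            chars.update(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            chars.add(body[i])
            i += 1
    return chars


def pattern_matches(pattern, dialed):
    """Return True if the dialed string matches the extension name."""
    if not pattern.startswith("_"):
        return pattern == dialed  # no underscore: plain literal extension
    p = pattern[1:]
    i = 0  # position in the pattern
    j = 0  # position in the dialed string
    while i < len(p):
        ch = p[i]
        if ch == ".":
            return len(dialed) - j >= 1  # one or more remaining characters
        if ch == "!":
            return True  # zero or more remaining characters
        if ch == "[":
            end = p.index("]", i)
            allowed = expand_bracket(p[i + 1:end])
            i = end + 1
        else:
            allowed = CLASSES.get(ch, {ch})  # unknown symbols are literal
            i += 1
        if j >= len(dialed) or dialed[j] not in allowed:
            return False
        j += 1
    return j == len(dialed)
```

For example, pattern_matches("_NXX", "234") is true and pattern_matches("_NXX", "134") is false, since N does not cover 1.
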
I have not found any examples of lowercase character pattern matching,
so I don't think very many people actually use it, though I could be
wrong.
I think all lowercase characters should be literal. I also would not
mind a global parameter 'setlowercasematching=yes', but it should
default to 'no'.
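
Roughly what I have in mind, as a Python sketch rather than a patch: the lowercase_matching argument stands in for the suggested 'setlowercasematching' setting, and allowed_chars is a hypothetical helper, not an existing Asterisk function.

```python
# Sketch only: lowercase letters are literal unless the (hypothetical)
# lowercase_matching option is switched on. Not Asterisk code.

DIGIT_CLASSES = {"X": "0123456789", "Z": "123456789", "N": "23456789"}


def allowed_chars(symbol, lowercase_matching=False):
    """Return the set of characters one pattern symbol may match."""
    if symbol in DIGIT_CLASSES:
        return set(DIGIT_CLASSES[symbol])
    if lowercase_matching and symbol.upper() in DIGIT_CLASSES:
        # Option enabled: x, z and n behave like X, Z and N.
        return set(DIGIT_CLASSES[symbol.upper()])
    return {symbol}  # default: the symbol only matches itself
```

With the default, allowed_chars("x") is just {"x"}; with lowercase_matching=True it is the same digit set as "X".
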
IMHO, any undocumented or unintentional behavior, good or bad, should
be labeled a bug and either fixed/taken out, or fully implemented
formally and supported.
--
Thanks.
JR
---------------------
JR Richardson
Engineering for the Masses