[asterisk-dev] [design] Matching algorithm

Tue Jun 3 16:07:08 CDT 2008

> There is a bug that has been filed, number 12777, which describes a possible
> bug in the current matching algorithm.  I say possible, because it could also
> be seen as a feature.
>
> Historically, in 1.2 and previous versions, the pattern matching was done
> based upon ASCII sorting.  That is, "_1" sorted before "_X", so "_1" matched
> before "_X".  This broke down from what people expected when you take into
> account class-based matching, such as "_[1-4]".  Since "X" occurs earlier in
> the ASCII table than "[", "_X" matched before "_[1-4]".
>
> This was changed in 1.4, such that each class matched as if it were a single
> entity and more specific matches matched first.  So in the last example the
> sorting priority is reversed: "_[1-4]" now matches before "_X".
>
> In the matching code, however, there is an anomaly, in that if you use the
> lowercase versions of the common classes, that is "n", "x", and "z", they
> currently match earlier than more specific matches.  That is, "_x" matches
> before "_[1-4]", which matches before "_X".
>
> Now, we are not going to change this behavior in 1.4, certainly.  That has the
> potential to break currently working dialplans, and where we can reasonably
> foresee such an outcome, we'd like to avoid that.
>
> However, this is certainly an unintended behavior, and the question then
> becomes, do we document this as a way to override the pattern match
> algorithm, or do we change the lowercase class letters to behave the same
> as the uppercase class letters?
>
> I open the floor to more discussion.
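
To check that I am reading the two orderings correctly, here is a rough
Python sketch of them. It is not the Asterisk source. The helper names
(expand_class, class_size) are made up for illustration, and it only
handles patterns with a single position after the underscore.

```python
# Illustration only: NOT the Asterisk implementation.

def expand_class(token):
    """Expand a bracket class such as "[1237-9]" into the characters it covers."""
    body = token[1:-1]
    chars = set()
    i = 0
    while i < len(body):
        # "a-b" inside the brackets is a range; anything else is a single character.
        if i + 2 < len(body) and body[i + 1] == "-":
            chars.update(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            chars.add(body[i])
            i += 1
    return chars

def class_size(token):
    """How many different characters one pattern position can match (hypothetical helper)."""
    sizes = {"X": 10, "Z": 9, "N": 8}
    if token in sizes:
        return sizes[token]
    if token.startswith("["):
        return len(expand_class(token))
    return 1  # a literal matches exactly one character

patterns = ["_X", "_[1-4]", "_1"]

# 1.2 and earlier, as described: plain ASCII ordering of the pattern strings.
ascii_order = sorted(patterns)

# 1.4, as described: the narrower (more specific) class is tried first.
specific_order = sorted(patterns, key=lambda p: class_size(p[1:]))

print(ascii_order)     # ['_1', '_X', '_[1-4]']
print(specific_order)  # ['_1', '_[1-4]', '_X']
```

The lowercase anomaly is deliberately left out of this sketch, since how
it should behave is the open question.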

From the documentation and wiki:

Extension names are not limited to single specific extension
"numbers". A single extension can also match patterns. In the
extensions.conf file, an extension name is a pattern if it starts with
the underscore symbol (_). In an extension pattern, the following
characters have special meanings:

Special Characters for Pattern Matching

   X          matches any digit from 0-9
   Z          matches any digit from 1-9
   N          matches any digit from 2-9
   [1237-9]   matches any digit or letter in the brackets
              (in this example, 1,2,3,7,8,9)
   .          wildcard, matches one or more characters
   !          wildcard, matches zero or more characters immediately
              (only Asterisk 1.2 and later, see note)
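
As a sanity check on the table above, here is a minimal Python sketch
that translates a pattern into a regular expression using only the
documented characters. The function names (pattern_to_regex, matches) are
hypothetical, and this is not how Asterisk itself does the matching. It
knows only the uppercase classes, so lowercase x, z and n fall through as
literals; that is an assumption of the sketch, not a statement of what
1.4 currently does.

```python
import re

# Documented single-character classes from the table above.
CLASSES = {"X": "[0-9]", "Z": "[1-9]", "N": "[2-9]"}

def pattern_to_regex(pattern):
    """Build a compiled regex for an extension name (illustration only)."""
    if not pattern.startswith("_"):
        # No leading underscore: the name is a plain literal extension.
        return re.compile(re.escape(pattern))
    body = pattern[1:]
    parts = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "[":
            # Copy the bracket class through unchanged; assumes simple
            # contents such as "1237-9" with a closing bracket present.
            end = body.index("]", i)
            parts.append(body[i:end + 1])
            i = end + 1
            continue
        if ch in CLASSES:
            parts.append(CLASSES[ch])
        elif ch == ".":
            parts.append(".+")   # one or more characters
        elif ch == "!":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))  # everything else is literal
        i += 1
    return re.compile("".join(parts))

def matches(pattern, dialed):
    """True if the whole dialed string matches the pattern."""
    return pattern_to_regex(pattern).fullmatch(dialed) is not None

print(matches("_NXX", "234"))        # True
print(matches("_NXX", "134"))        # False, N excludes 1
print(matches("_[1237-9]X", "40"))   # False, 4 is not in the class
print(matches("_9.", "9"))           # False, "." needs at least one character
print(matches("_9!", "9"))           # True, "!" accepts zero characters
```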

I have not found any examples of lowercase character pattern matching,
so I don't think very many people actually use it, though I could be
wrong.

I think all lowercase characters should be literal.  I also would not
mind a global parameter such as 'setlowercasematching=yes', but it
should default to 'no'.

IMHO, any undocumented or unintentional behavior, good or bad, should
be labeled a bug and either fixed or removed, or else formally
implemented and supported.

-- 
Thanks.
JR
---------------------
JR Richardson
Engineering for the Masses