[asterisk-dev] [design] Matching algorithm

Tue Jun 3 14:25:12 CDT 2008

On Tue, Jun 03, 2008 at 12:23:32PM -0600, Steve Murphy wrote:
> On Tue, 2008-06-03 at 10:34 -0500, Tilghman Lesher wrote:
> > There is a bug that has been filed, number 12777, which describes
> > a possible bug in the current matching algorithm. I say possible,
> > because it could also be seen as a feature.
> >
> > Historically, in 1.2 and previous versions, the pattern matching was
> > done based upon ASCII sorting. That is, "_1" sorted before "_X", so
> > "_1" matched before "_X". This broke down from what people expected
> > when you take into account class-based matching, such as "_[1-4]".
> > Since "X" occurs earlier in the ASCII table than "[", "_X" matched
> > before "_[1-4]".
> >
> > This was changed in 1.4, such that each class matched as if it were
> > a single entity and more specific matches matched first. So the last
> > example reversed the sorting priority order, "_[1-4]" now matches
> > before "_X".
> >
> > In the matching code, however, there is an anomaly, in that if you
> > use the lowercase versions of the common classes, that is "n", "x",
> > and "z", they currently match earlier than more specific matches.
> > That is, "_x" matches before "_[1-4]", which matches before "_X".
> >
> > Now, we are not going to change this behavior in 1.4, certainly.
> > That has the potential to break currently working dialplans, and
> > where we can reasonably foresee such an outcome, we'd like to avoid
> > that.
> >
> > However, this is certainly an unintended behavior, and the question
> > then becomes, do we document this as a way to override the pattern
> > match algorithm, or do we change the lowercase class letters to
> > behave the same as the uppercase class letters?
> >
> > I open the floor to more discussion.
>
> I thought I'd weigh in on this, as I've done a share of pattern
> matching hacking.
>
> IIRC, my fast pattern matcher does a pass over the string and
> upcases every N, X, and Z. This is in trunk and 1.6.0 at the
> moment. If you guys decide that lowercase is not a matching pattern,
> this will have ramifications on *that* code, and I'll have to mod it
> to duplicate the old behavior.
>
> I'm willing to abide by whatever's decided, tho. It's a pretty minor
> tweak to the code. (I believe/think/hope)

So, do I have this straight: the case of a class letter doesn't change
what it *matches*, only where the patterns containing it sort?

This was a wart, then, when it cropped up in 1.4?
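
To check my own reading of the three orderings described above, here is
a minimal Python sketch. It is not the Asterisk code: sort_key, tokens,
set_size and the fold_case flag are names I made up, and the model
(rank each position by how many characters it can match, with lowercase
class letters counted as plain literals unless folded) is only my
assumption about what produces the ordering Tilghman describes.

```python
# Hypothetical sketch of the sort orders discussed in this thread.
# None of these names come from Asterisk; the ranking rule is an assumption.

CLASS_SIZES = {"X": 10, "Z": 9, "N": 8}  # X = 0-9, Z = 1-9, N = 2-9


def tokens(pattern):
    """Split a pattern such as "_[1-4]X" into one token per position."""
    body = pattern[1:] if pattern.startswith("_") else pattern
    out, i = [], 0
    while i < len(body):
        if body[i] == "[":
            end = body.index("]", i)
            out.append(body[i:end + 1])
            i = end + 1
        else:
            out.append(body[i])
            i += 1
    return out


def set_size(token, fold_case):
    """Count how many characters one token can match."""
    if token.startswith("["):
        inner, count, i = token[1:-1], 0, 0
        while i < len(inner):
            if i + 2 < len(inner) and inner[i + 1] == "-":
                # A range such as 1-4 covers four characters.
                count += ord(inner[i + 2]) - ord(inner[i]) + 1
                i += 3
            else:
                count += 1
                i += 1
        return count
    key = token.upper() if fold_case else token
    # Anything that is not a known class letter counts as a literal.
    return CLASS_SIZES.get(key, 1)


def sort_key(pattern, fold_case):
    """Smaller sets sort first, so more specific patterns match first."""
    return [(set_size(t, fold_case), t) for t in tokens(pattern)]


patterns = ["_x", "_X", "_[1-4]"]

# 1.2-style plain ASCII sort: "X" comes before "[".
print(sorted(patterns))
# prints ['_X', '_[1-4]', '_x']

# Specificity sort with lowercase treated as a literal (the anomaly).
print(sorted(patterns, key=lambda p: sort_key(p, fold_case=False)))
# prints ['_x', '_[1-4]', '_X']

# Specificity sort with lowercase folded to uppercase.
print(sorted(patterns, key=lambda p: sort_key(p, fold_case=True)))
# prints ['_[1-4]', '_X', '_x']
```

The second result is the order Tilghman reports ("_x" before "_[1-4]"
before "_X"); the third is what I would expect if the lowercase letters
were made to behave like the uppercase ones.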

To answer your question from my perspective, given:

> > Now, we are not going to change this behavior in 1.4, certainly.
> > That has the potential to break currently working dialplans, and
> > where we can reasonably foresee such an outcome, we'd like to avoid
> > that.
> >
> > However, this is certainly an unintended behavior, and the question
> > then becomes, do we document this as a way to override the pattern
> > match algorithm, or do we change the lowercase class letters to
> > behave the same as the uppercase class letters?

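Unless I'm misunderstanding you, the latter alternative directly
contradicts the assertion in the first paragraph: if changing the
behavior could break working dialplans, you can't make the lowercase
letters sort like the uppercase ones.  If you think people are
depending on the undocumented "override" behavior, then you have to
maintain it.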

Or did I misunderstand?

Cheers,
- jra
-- 
Jay R. Ashworth                   Baylink                      jra at baylink.com
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com                     '87 e24
St Petersburg FL USA      http://photo.imageinc.us             +1 727 647 1274

	     Those who cast the vote decide nothing.
	     Those who count the vote decide everything.
	       -- (Joseph Stalin)