[asterisk-dev] Better pattern matching

Thu Aug 2 12:09:12 CDT 2007

Vasil Kolev wrote:
> В ср, 2007-08-01 в 16:08 -0600, Steve Murphy написа:
> 
>> We might be able to extend the current algorithm to handle a few more
>> simple cases without resorting to such huge machinery. We shall see.
>>
> 
> Going through this thread, I'm thinking - regular expressions are parsed
> to finite state machines, and it shouldn't be hard (for some values of
> hard, I have to brush on my discrete maths memories) to build one
> automaton to handle the full dialplan for a context and to be able on
> each digit to cut the possible places to go. There's some stuff that
> can't be done this way for sure (as we all know, all kind of weird shit
> is possible with regex), but at least the current syntax can get a bit
> extended, and probably a lot of IVRs simplified.

The data structure that you're grasping for is the 'trie':
	http://en.wikipedia.org/wiki/Trie

At a previous job, one of my developers wrote a regex -> trie generator,
so we could do high-speed lookups of prefix and suffix matches on
usernames coming into a radius proxy.  Once we finally got it debugged,
it was amazingly fast.  For that particular appplication we had several
hundred prefix and suffix patterns, and generated one giant tree to
handle them all.

This was pretty slow to generate.  So we did the work in a helper
program, and wrote the binary trie into a disk file.  The main app
would stat() the file periodically, and if it was newer than
the last one read in, just suck the entire thing into memory, and
change the pointer to the root of the trie.  I guess we called
free on the old one.  Seamless, fast and worked extremely well...

Generating one trie per context would probably work really well.

-Kurt