[Dundi] Caching

Sat Oct 23 17:29:51 CDT 2004

At 4:55 PM -0500 on 10/23/04, Mark Spencer wrote:
>>	I would rather see a mechanism for Announcers to push changes into
>>the peer group. Change delays then become a function of convergence time
>>rather than TTL.
>
>This becomes totally unmanageable with a large system because 
>essentially every node must know the routes for all numbers as they 
>come and go. Definitely a bad idea.  However, we recently added a 
>"PRECACHE" function to DUNDI so that, for example, smaller nodes 
>could "PRECACHE" answers up to a more master node.
>
>If this works, then we could set up a "best practices" policy that 
>nodes with less than a thousand (or whatever the magic number is) 
>routes would simply "PRECACHE" their answers up to one or two other 
>nodes, rather than actually be queried the whole time.
>
>What do people think about this?

I think that's reasonable.  At a minimum, it would allow for much 
faster lookup times in an enterprise environment that was 
topologically distributed by allowing a full cache push.  Even a 1 or 
2 second delay in some companies during a number lookup is perceived 
to be "too long".  Whaddevah.

The trick on this is enforcement of the "magic" number by the caching 
host.  In other words, don't let someone dump a million routes into 
your cache.  If you want to protect the core, some type of backoff is 
required when accepting large numbers of updates.  Take care that this 
is configurable; some nodes will, through natural growth, see loads 
that look like a flood of precache data, so this needs to be a 
tunable knob.
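
To make the cap idea concrete, here is a rough sketch of a per-peer 
limit on precached routes.  This is not how DUNDi or Asterisk does it; 
the PeerCache class, its fields, and the accept() method are all 
hypothetical names made up for illustration, and the default of 1000 is 
just the "magic number" floated in this thread.

```python
from dataclasses import dataclass, field

DEFAULT_MAX_ROUTES = 1000  # assumed default; the "magic number" from this thread


@dataclass
class PeerCache:
    """Precached routes accepted from one announcing peer (hypothetical)."""

    max_routes: int = DEFAULT_MAX_ROUTES  # the tunable knob, set per peer
    routes: dict = field(default_factory=dict)  # number -> destination

    def accept(self, number: str, destination: str) -> bool:
        """Store a pushed route; refuse new numbers once the cap is reached."""
        is_new = number not in self.routes
        if is_new and len(self.routes) >= self.max_routes:
            return False  # over the cap: the peer must be queried instead
        self.routes[number] = destination  # updates to known numbers still pass
        return True
```

A node that outgrows the default just gets a bigger max_routes for 
that one peer, rather than everyone getting a higher global limit.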

I'd say that as a start, 1000 numbers would be reasonable as a 
hardcoded default cap on a precache push, if that option is turned on 
at the "announcer".  Similarly, 1000 numbers within 10 seconds might be 
a reasonable trigger (rolling average?) at which a "listener" would 
start to back off or slow down ACKs.  Again, this should be tunable 
per DUNDi "peer", as some people won't care or will want a more 
aggressive cache pre-population.  Putting in a slowdown for ACKs 
might also prevent some inadvertent DoS attacks due to 
misconfiguration - I can't think of an exact instance where that 
would be the case right now, but I've seen some really wacky 
dialplans that might cause lots of headaches for peers.
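
The ACK slowdown could look roughly like the sketch below.  It uses a 
simple sliding-window count rather than a true rolling average, and 
every name and number in it (AckThrottle, delay_step, max_delay, the 
50 ms step) is an assumption for illustration, not anything DUNDi 
actually implements.  The caller passes in the current time in seconds 
and delays its ACK by the returned amount.

```python
from collections import deque


class AckThrottle:
    """Per-peer ACK slowdown based on a sliding window (hypothetical)."""

    def __init__(self, max_updates=1000, window=10.0,
                 delay_step=0.05, max_delay=2.0):
        self.max_updates = max_updates  # updates allowed per window
        self.window = window            # window length in seconds
        self.delay_step = delay_step    # extra delay per excess update
        self.max_delay = max_delay      # ceiling on the ACK delay
        self.stamps = deque()           # arrival times still inside the window

    def ack_delay(self, now: float) -> float:
        """Record one update at time `now`; return seconds to wait before ACKing."""
        self.stamps.append(now)
        # Drop arrivals that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        excess = len(self.stamps) - self.max_updates
        if excess <= 0:
            return 0.0  # under the trigger: ACK immediately
        return min(self.max_delay, excess * self.delay_step)
```

Each peer would get its own AckThrottle, so an aggressive precacher can 
be given a higher max_updates without loosening things for anyone else.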

JT