[hydra-dev] * Fault Tolerance vs High availability

Tim Panton thp at westhawk.co.uk
Thu Jun 3 10:45:51 CDT 2010


The specific requirement was that if a single Asterisk box went down,
then there would be no dropped calls. Calls that it had been routing
would be re-connected (after a short delay) between the respective
endpoints, and any in-the-middle processing (recording, conferencing,
etc.) would be reinstated.

(And yes, it is doable, with carefully faked re-INVITEs.)
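
For illustration only, here is a minimal sketch of the phased hand-over
that Ed draws in the quoted message below (initial, intermediate, final).
It assumes a hypothetical session supervisor that tracks each call's media
path; the Call class and migrate() function are made-up names, not part of
Hydra or Asterisk, and the actual re-INVITE signalling is left out. It also
models the administrative migration case, where the old relay is still
alive, rather than recovery from a dead box.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """A call as seen by a hypothetical session supervisor."""
    caller: str
    callee: str
    relays: list = field(default_factory=list)  # media components, in order

    def path(self):
        """Return the full media path, endpoints included."""
        return [self.caller, *self.relays, self.callee]


def migrate(call, old, new):
    """Move `call` from relay `old` to relay `new` in two phases.

    Returns the media path after each phase. In a real system each
    phase would be driven by re-INVITEs to the affected legs; this
    sketch only tracks the resulting topology.
    """
    if old not in call.relays:
        raise ValueError(f"{old} is not on the call path")
    phases = [call.path()]                 # initial
    idx = call.relays.index(old)
    call.relays.insert(idx + 1, new)       # intermediate: both relays in path
    phases.append(call.path())
    call.relays.remove(old)                # final: old relay dropped
    phases.append(call.path())
    return phases


if __name__ == "__main__":
    call = Call("Alice", "Bob", ["CP30"])
    for step in migrate(call, "CP30", "CPD2"):
        print(" ---> ".join(step))
```

The intermediate phase keeps both relays in the path so that neither
endpoint is ever left without a media peer while the hand-over is in
progress.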

T.


On 3 Jun 2010, at 16:37, Ed Guy wrote:

> Tim,
> 
> I suspect portions of the competing system were fault tolerant, but
> I suspect the system as a whole wasn't. (OTOH, I could be wrong...
> but this is a two-beer bet.)
> 
> Migrating calls by an administrative action is a bit different from full
> fault tolerance.
> This requirement makes a great deal of sense and perhaps could be
> implemented by a sort of phased re-INVITE (in SIP terms) under the
> direction of some session supervisor.
> 
> e.g.,
> 
> initial:
> Alice    ---> CP30  -----> Bob
> (where CP30 is the audio relay/translator, etc., component)
> 
> intermediate:
> Alice    ---> CP30    ---> CPD2  -----> Bob
> (where CPD2 is the second audio relay/translator, etc., component)
> 
> final:
> Alice    ---> CPD2  -----> Bob
> 
> /ed
> 
> 
> On 6/3/10 11:05 AM, Tim Panton wrote:
>> I was one of the people who advocated support for fail-over of live calls.
>> I had 2 reasons:
>> 	1) We had a consultancy job for a 'blue lamp' service (police/fire/etc.) where it was
>> a requirement, so Asterisk was rejected. I feel that Hydra should be aiming to solve
>> problems that Asterisk can't.
>> 	2) If Hydra's natural habitat is the cloud, then the ability to migrate live calls
>> will be very handy. Imagine the case where you rolled out 100 instances
>> to cope with a traffic spike. Once the spike is over, you can't shut down the
>> instances until all the longer calls have finished. What you'd like is to migrate the
>> stragglers to a couple of instances and shut everything else down.
>> 
>> T.
>> 
>> On 3 Jun 2010, at 15:44, Malcolm C. Davenport wrote:
>> 
>> 
>>> Howdy,
>>> 
>>> When this sort of behavior was first discussed internally, the conclusion was that we didn't want to make decisions that would preclude our ability to provide this kind of capability.
>>> 
>>> It seemed to the participants in that discussion that the ability to actively migrate a call from one component to another without significant interruption (such as a dropped call, as opposed to a mere audio artifact or brief silence) would be highly valued.  Were we placing too much emphasis on something that's too esoteric?
>>> 
>>> Cheers.
>>> 
>>> ----- Original Message -----
>>> 
>>>> Hydrates,
>>>> 
>>>> During the meeting a couple of months ago, there was considerable
>>>> discussion about fault tolerance.
>>>> 
>>>> In my book, fault tolerance implies that if there is a system failure
>>>> on a component that involves an active call, the call is migrated,
>>>> without significant interruption, to another component.
>>>> 
>>>> Is this really a requirement? Most carrier-grade systems merely
>>>> require high availability, i.e., if a component fails, a call may
>>>> drop, but the next call must go through ("five nines" or better
>>>> of the time).
>>>> 
>>>> Fault-tolerant architectures are very expensive and inefficient, but
>>>> sometimes you can't afford any failure.
>>>> 
>>>> 
>>>> /ed
>>>> 
>>>> 
>>>> _______________________________________________ Project Hydra
>>>> Development Discussion List
>>>> NOTE: All content you receive from this list should be treated as
>>>> confidential.
>>>> 
>>>> To UNSUBSCRIBE or update options visit:
>>>> http://lists.digium.com/mailman/listinfo/hydra-dev
>>>> 
>>> -- 
>>> --------------------------------------------------
>>> Malcolm Davenport
>>> Digium, Inc. | Senior Product Manager
>>> 445 Jan Davis Drive NW - Huntsville, AL 35806 - US
>>> Tel: +1 256 428 6252
>>> Fax: +1 256 864 0464
>>> malcolmd at digium.com
>>> 
>> Tim Panton - Web/VoIP consultant and implementor
>> www.westhawk.co.uk
>> 

Tim Panton - Web/VoIP consultant and implementor
www.westhawk.co.uk






