[hydra-dev] * Fault Tolerance vs High availability

Thu Jun 3 10:50:31 CDT 2010

You have to "front end" Asterisk with some sort of fault-tolerant
gateway that can do call tracking.
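
As a rough illustration of what "call tracking" at the front end might
involve, here is a minimal Python sketch. Everything in it (the CallTracker
class, its method names, the backend labels) is hypothetical and is not part
of Asterisk or Hydra. It only shows the bookkeeping a gateway would need in
order to know which calls to re-establish when a box goes down.

```python
from dataclasses import dataclass, field


@dataclass
class CallTracker:
    """Hypothetical bookkeeping for a front-end gateway (not an Asterisk API)."""

    # call_id -> (caller, callee, backend currently handling the call)
    calls: dict = field(default_factory=dict)

    def start(self, call_id, caller, callee, backend):
        """Record a new call and the backend that is routing it."""
        self.calls[call_id] = (caller, callee, backend)

    def end(self, call_id):
        """Forget a call that has finished normally."""
        self.calls.pop(call_id, None)

    def fail_over(self, dead_backend, spare_backend):
        """Reassign every call on dead_backend and return their call ids.

        The returned ids are the calls the gateway would then have to
        reconnect between their endpoints via spare_backend.
        """
        moved = []
        for call_id, (caller, callee, backend) in list(self.calls.items()):
            if backend == dead_backend:
                self.calls[call_id] = (caller, callee, spare_backend)
                moved.append(call_id)
        return moved
```

The tracker itself is then the single point of failure, which is why the
gateway holding it has to be the fault-tolerant piece.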

On Thu, Jun 3, 2010 at 10:45 AM, Tim Panton <thp at westhawk.co.uk> wrote:
> The specific requirement was that if a single asterisk box went down,
> then there would be no dropped calls. Calls that it had been routing
> would be re-connected (after a short delay) between the respective
> endpoints, and any in-the-middle processing (recording, conferencing
> etc) would be re-instated.
>
> (And yes, it is doable, with carefully faked re-invites).
>
> T.
>
>
> On 3 Jun 2010, at 16:37, Ed Guy wrote:
>
>> Tim,
>>
>> I suspect portions of the competing system were fault tolerant, but
>> I suspect the system as a whole wasn't. (OTOH, I could be wrong...
>> but this is a two-beer bet.)
>>
>> Migrating calls by an administrative action is a bit different from full
>> fault tolerance.
>> This requirement makes a great deal of sense and perhaps could be
>> implemented by a sort of phased re-invite (in SIP terms) under the
>> direction of some session supervisor.
>>
>> e.g.,
>>
>> initial:
>> Alice    ---> CP30  -----> Bob
>> (where CP30 is the audio relay/translator, etc., component)
>>
>> intermediate:
>> Alice    ---> CP30    ---> CPD2  -----> Bob
>> (where CPD2 is the second audio relay/translator, etc., component)
>>
>> final:
>> Alice    ---> CPD2  -----> Bob
>>
>> /ed
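
The three phases in Ed's diagram can be written down as data, which makes it
easy to see which legs a session supervisor would have to bring up at each
step. This is only a sketch: the function names are made up for illustration,
and it models the media path as a simple list of hops rather than doing any
real SIP signalling.

```python
def migration_phases(caller, callee, old_relay, new_relay):
    """Return the media path at each phase of a hypothetical phased re-invite."""
    return [
        ("initial", [caller, old_relay, callee]),
        ("intermediate", [caller, old_relay, new_relay, callee]),
        ("final", [caller, new_relay, callee]),
    ]


def reinvites_needed(before, after):
    """List the hops that exist only in the new path.

    Each such hop stands for a leg that a supervisor would have to set up
    (for example with a SIP re-INVITE) before the old legs are dropped.
    """
    old_hops = set(zip(before, before[1:]))
    return [hop for hop in zip(after, after[1:]) if hop not in old_hops]


if __name__ == "__main__":
    phases = migration_phases("Alice", "Bob", "CP30", "CPD2")
    for (_, path_a), (name_b, path_b) in zip(phases, phases[1:]):
        print(name_b, reinvites_needed(path_a, path_b))
```

Going through the intermediate phase means that only one side of the old
relay changes at a time, which is what makes this closer to an administrative
migration than to recovery from a crash of CP30.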
>>
>>
>> On 6/3/10 11:05 AM, Tim Panton wrote:
>>> I was one of the people who advocated support for fail-over of live calls.
>>> I had 2 reasons:
>>>      1) We had a consultancy job for a 'blue lamp' service (police/fire/etc) where it was
>>> a requirement, so Asterisk was rejected. I feel that Hydra should be aiming to solve
>>> problems that Asterisk can't.
>>>      2) If Hydra's natural habitat is the cloud, then the ability to migrate live calls
>>> will be very handy. Imagine the case where you rolled out 100 instances
>>> to cope with a traffic spike. Once the spike is over, you can't shut down the
>>> instances until all the longer calls have finished. What you'd like is to migrate the
>>> stragglers to a couple of instances and shut everything else down.
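
Tim's scale-down case amounts to a small packing problem. A minimal sketch,
assuming every call costs the same, every instance has the same capacity, and
no instance is already over that capacity (the function name and the numbers
in the example are invented for illustration):

```python
def plan_drain(load, capacity):
    """Pick which instances to keep so the remaining calls fit.

    load: dict of instance name -> number of live calls (hypothetical data).
    capacity: max calls one instance can carry (assumed equal for all, >= 1).
    Returns (keep, drain): instances to keep, and instances to empty and stop.
    """
    total = sum(load.values())
    needed = max(1, -(-total // capacity))  # ceiling division, at least one
    # Keep the busiest instances so that the fewest calls have to move.
    ranked = sorted(load, key=lambda name: load[name], reverse=True)
    return ranked[:needed], ranked[needed:]


if __name__ == "__main__":
    keep, drain = plan_drain({"i1": 3, "i2": 0, "i3": 5, "i4": 1}, capacity=10)
    print("keep:", keep, "drain:", drain)
```

The plan only says where the stragglers should end up. Actually moving them
still needs the live-call migration being discussed in this thread.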
>>>
>>> T.
>>>
>>> On 3 Jun 2010, at 15:44, Malcolm C. Davenport wrote:
>>>
>>>
>>>> Howdy,
>>>>
>>>> When this sort of behavior was first discussed internally, the conclusion was that we didn't want to make decisions that would preclude our ability to provide this kind of capability.
>>>>
>>>> It seemed to the participants in that discussion that the ability to actively migrate a call from one component to another without significant interruption (such as a dropped call, as opposed to a mere audio artifact or silence) would be highly valued.  Were we placing too much emphasis on something that's too esoteric?
>>>>
>>>> Cheers.
>>>>
>>>> ----- Original Message -----
>>>>
>>>>> Hydrates,
>>>>>
>>>>> During the meeting a couple of months ago, there was considerable
>>>>> discussion about fault tolerance.
>>>>>
>>>>> In my book, fault tolerance implies that if there is a system failure
>>>>> on a component that involves an active call, the call is migrated,
>>>>> without significant interruption, to another component.
>>>>>
>>>>> Is this really a requirement? Most carrier-grade systems merely
>>>>> require high availability, i.e., if a component fails, a call may
>>>>> drop, but the next call must go through ("five nines" or better of
>>>>> the time).
>>>>>
>>>>> Fault-tolerant architectures are very expensive and inefficient, but
>>>>> sometimes you can't afford any failure.
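
For a sense of scale, "five nines" can be turned into a downtime budget with
a couple of lines of arithmetic. The helper below is only illustrative, and
it assumes a year of 365.25 days:

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_minutes_per_year(availability):
    """Allowed downtime per year for a given availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, availability in [("three nines", 0.999), ("five nines", 0.99999)]:
        print(label, round(downtime_minutes_per_year(availability), 2), "min/year")
```

Five nines works out to roughly 5.26 minutes of downtime a year. That budget
says nothing about the calls that were in progress when a component failed,
which is exactly the difference between high availability and fault
tolerance.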
>>>>>
>>>>>
>>>>> /ed
>>>>>
>>>>>
>>>>> _______________________________________________ Project Hydra
>>>>> Development Discussion List
>>>>> NOTE: All content you receive from this list is should be treated as
>>>>> confidential.
>>>>>
>>>>> To UNSUBSCRIBE or update options visit:
>>>>> http://lists.digium.com/mailman/listinfo/hydra-dev
>>>>>
>>>> --
>>>> --------------------------------------------------
>>>> Malcolm Davenport
>>>> Digium, Inc. | Senior Product Manager
>>>> 445 Jan Davis Drive NW - Huntsville, AL 35806 - US
>>>> Tel: +1 256 428 6252
>>>> Fax: +1 256 864 0464
>>>> malcolmd at digium.com
>>>>
>>> Tim Panton - Web/VoIP consultant and implementor
>>> www.westhawk.co.uk
>>>
>>
>
> Tim Panton - Web/VoIP consultant and implementor
> www.westhawk.co.uk

-- 

Chris Tooley
mobile: 615-525-8067
Instant Messenger
MSN: ctooley at ntrc.net
AIM: mrchristooley
Yahoo: mrchristooley
Google Talk: ctooley at gmail.com