[asterisk-bugs] [JIRA] (ASTERISK-21194) chan_sip can fail to find a peer during reload

Matt Jordan (JIRA) noreply at issues.asterisk.org
Mon Mar 4 09:52:19 CST 2013


    [ https://issues.asterisk.org/jira/browse/ASTERISK-21194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=203646#comment-203646 ] 

Matt Jordan commented on ASTERISK-21194:
----------------------------------------

{quote}
It's good to know work on this is happening. The risk seems very small for the config being in limbo at the time it's required.

Matt, I hear and respect what you're saying about the reload operation being what it is. My question becomes this - how far off is Asterisk 12? I have to agree with you that such a patch would be extremely intrusive based on what I've seen of the code. Especially if we're switching to a new config mechanism/framework.
Let me ask this - if I (or someone else for that matter) put in the effort to generate the patch -
a) would it at least be given some consideration?
{quote}

So, first let's establish some of what would need to happen in order to ensure that during a reload, operations currently in-flight complete successfully, operations issued during a reload block until the reload is finished, and all blocked and subsequent operations get the new information.

# Global information is still tracked via static variables in {{chan_sip}}. Those would all have to be moved into a global ao2 object, and all references to that global information would have to obtain that object properly and dispose of it properly.
# There are multiple containers tracking dialogs and peers. They have to be properly synchronized and the lifetimes of the objects properly managed during a reload.
# Information has to be loaded from multiple configuration sources during a reload: {{sip.conf}}, {{users.conf}}, as well as realtime backends. That information has to be properly merged, and the merged result has to be applied atomically.
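
To make the first point concrete, here is a minimal, standalone sketch of a reference-counted global snapshot that is replaced with a single pointer swap. It is not {{chan_sip}} code and does not use the astobj2 API: the struct, its fields, and the {{globals_get}}/{{globals_put}}/{{globals_reload}} helpers are hypothetical names chosen for illustration, and a plain pthread mutex stands in for whatever locking the real object would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for settings that live in static variables today. */
struct sip_globals {
    int refcount;                /* protected by globals_lock */
    int default_expiry;
    char default_context[80];
};

static pthread_mutex_t globals_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sip_globals *current_globals; /* owns one reference when non-NULL */

/* Take a reference to the current snapshot; NULL if nothing is loaded yet. */
struct sip_globals *globals_get(void)
{
    struct sip_globals *g;

    pthread_mutex_lock(&globals_lock);
    g = current_globals;
    if (g) {
        g->refcount++;
    }
    pthread_mutex_unlock(&globals_lock);
    return g;
}

/* Drop a reference; the snapshot is freed when the last holder lets go. */
void globals_put(struct sip_globals *g)
{
    int last;

    if (!g) {
        return;
    }
    pthread_mutex_lock(&globals_lock);
    assert(g->refcount > 0);
    last = (--g->refcount == 0);
    pthread_mutex_unlock(&globals_lock);
    if (last) {
        free(g);
    }
}

/* Reload: build the new snapshot off to the side, then publish it in one swap. */
int globals_reload(int default_expiry, const char *default_context)
{
    struct sip_globals *fresh = calloc(1, sizeof(*fresh));
    struct sip_globals *old;

    if (!fresh) {
        return -1;
    }
    fresh->refcount = 1; /* the reference owned by current_globals */
    fresh->default_expiry = default_expiry;
    strncpy(fresh->default_context, default_context,
            sizeof(fresh->default_context) - 1);

    pthread_mutex_lock(&globals_lock);
    old = current_globals;
    current_globals = fresh;
    pthread_mutex_unlock(&globals_lock);

    globals_put(old); /* in-flight holders keep the old snapshot alive */
    return 0;
}
```

With this shape, an operation that called {{globals_get}} before the reload finishes with the old values, and anything that calls it afterwards sees the new ones. Note that the sketch does not make operations block while a reload is in progress; that would need additional synchronization on top of it, and every existing reader of the static variables would still have to be converted to the get/put discipline.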

The question isn't really "will it be given consideration" - the question is, can the things listed above be done in such a way that there is a high level of confidence that the patch does not introduce a regression? That is, can we do all of those things and know that:
# There are no crashes
# There are no deadlocks
# There are no resource/memory leaks
# Information is properly loaded/reloaded

My guess is: No. And that's not an indictment of anyone's abilities; that's just the result of 35000 lines of code being difficult to maintain.

We actually did think about tackling {{chan_sip}} using the configuration framework, and realized we'd be better off devoting development resources to the new SIP channel driver and making sure we made use of the configuration frameworks available there, rather than trying to refactor {{chan_sip}} some more (and probably breaking everything in the process).

{quote}
b) would I be able to expect some level of assistance from #asterisk(-dev)?
{quote}

If by assistance you mean being able to ask questions and get pointers from developers in #asterisk-dev: absolutely!

If by assistance you mean people contributing code: unknown.

If you're interested in this problem, I wouldn't start with {{chan_sip}}. Start with something that has similar problems (it can be reloaded but isn't completely thread-safe) yet is much more manageable. A few candidates:
* dnsmgr
* cdr
* res_rtp_asterisk

Note that these are still plenty challenging, but will get you involved in a refactoring effort that will give you familiarity with the various frameworks involved.

{quote}
c) who would be the best person to assist?
{quote}

In general, just ask a question. A fair number of people are familiar with the frameworks/code involved.

{quote}
d) should the patch not be merged - would someone at least still be willing to review it for me for obvious errors so that I can carry it on my own installations?
{quote}

Yes, it'd be put up for review. If the patch was deemed too intrusive for a release branch, it could still be evaluated for trunk.

{quote}
Assuming that I might not be able to get full atomicity, would a compromise where each peer is at least loaded/rejected atomically, and the global config done atomically be acceptable? In other words - rather than getting it 100%, get similar results as is current (ie, some peers can load whilst others fail), but at least make the portions that succeed more thread safe?
{quote}

I think you're going to find that doing each peer atomically and the global config atomically is just as hard as doing the whole thing. The configuration framework is designed to take care of that aspect of it for you - if you attempt to do it without the framework's assistance, you'll end up re-inventing a lot of what it already does.
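
As a rough illustration of what "the whole thing" means, the following standalone sketch stages a complete peer set off to the side, validates every entry, and only replaces the active set if all of them succeed. It is a generic all-or-nothing pattern under stated assumptions, not the Asterisk configuration framework's API: the structs and the {{peers_reload}}/{{peer_find}} helpers are hypothetical, the validation rules are invented for the example, and the swap is single-threaded (a real version would publish the pointer under a lock or as a reference-counted object).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Raw definition as it might come out of a config parser (hypothetical). */
struct peer_def { const char *name; const char *host; };

struct peer { char name[32]; char host[64]; };

struct peer_set { size_t count; struct peer *peers; };

static struct peer_set *active_set; /* what lookups see */

static void peer_set_free(struct peer_set *set)
{
    if (set) {
        free(set->peers);
        free(set);
    }
}

/* Validate and copy one definition. Returns 0 on success, -1 if rejected. */
static int peer_build(struct peer *out, const struct peer_def *def)
{
    if (!def->name || !*def->name || strlen(def->name) >= sizeof(out->name)) {
        return -1;
    }
    if (!def->host || !*def->host || strlen(def->host) >= sizeof(out->host)) {
        return -1;
    }
    strcpy(out->name, def->name);
    strcpy(out->host, def->host);
    return 0;
}

/* Build a complete staged set; publish it only if every peer was accepted. */
int peers_reload(const struct peer_def *defs, size_t count)
{
    struct peer_set *staged;
    size_t i;

    assert(defs || count == 0);
    staged = calloc(1, sizeof(*staged));
    if (!staged) {
        return -1;
    }
    staged->peers = calloc(count ? count : 1, sizeof(*staged->peers));
    if (!staged->peers) {
        free(staged);
        return -1;
    }
    for (i = 0; i < count; i++) {
        if (peer_build(&staged->peers[i], &defs[i])) {
            peer_set_free(staged); /* active_set is left untouched */
            return -1;
        }
    }
    staged->count = count;

    peer_set_free(active_set);
    active_set = staged;
    return 0;
}

/* Look up a peer by name in the currently active set. */
const struct peer *peer_find(const char *name)
{
    size_t i;

    if (!active_set) {
        return NULL;
    }
    for (i = 0; i < active_set->count; i++) {
        if (!strcmp(active_set->peers[i].name, name)) {
            return &active_set->peers[i];
        }
    }
    return NULL;
}
```

The point of the sketch is that a lookup never observes a half-built set: a failed reload leaves the previous peers in place, and a successful one replaces them all at once. Merging several sources and managing the lifetime of peers still referenced by dialogs is where the real work lies, and none of that is shown here.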

                
> chan_sip can fail to find a peer during reload
> ----------------------------------------------
>
>                 Key: ASTERISK-21194
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-21194
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_sip/General
>    Affects Versions: 11.2.1
>            Reporter: Jaco Kroon
>
> During a global system reload I saw this:
> {noformat}
> [Feb 28 16:50:26] VERBOSE[2712][C-0000317a] pbx.c:     -- Executing [number@prov:5] Dial("Local/number@foo-0000377b;2", "SIP/bar/number,,") in new stack
> [Feb 28 16:50:26] VERBOSE[2712][C-0000317a] netsock2.c:   == Using SIP RTP CoS mark 5
> [Feb 28 16:50:26] ERROR[2712][C-0000317a] netsock2.c: getaddrinfo("bar", "(null)", ...): Name or service not known
> [Feb 28 16:50:26] WARNING[2712][C-0000317a] chan_sip.c: No such host: bar
> [Feb 28 16:50:26] WARNING[2712][C-0000317a] app_dial.c: Unable to create channel of type 'SIP' (cause 20 - Subscriber absent)
> {noformat}
> sip show peer (after reload):
> {noformat}
>   * Name       : bar
>   Description  : 
>   Secret       : <Not set>
>   MD5Secret    : <Not set>
>   Remote Secret: <Not set>
>   Context      : uls-makecall
>   Record On feature : automon
>   Record Off feature : automon
>   Subscr.Cont. : <Not set>
>   Language     : 
>   Tonezone     : <Not set>
>   Accountcode  : bar
>   AMA flags    : Unknown
>   Transfer mode: open
>   CallingPres  : Presentation Allowed, Not Screened
>   Callgroup    : 
>   Pickupgroup  : 
>   Named Callgr : 
>   Nam. Pickupgr: 
>   MOH Suggest  : 
>   Mailbox      : 
>   VM Extension : 8579
>   LastMsgsSent : 0/0
>   Call limit   : 2147483647
>   Max forwards : 0
>   Dynamic      : No
>   Callerid     : "" <>
>   MaxCallBR    : 384 kbps
>   Expire       : -1
>   Insecure     : no
>   Force rport  : Auto (No)
>   Symmetric RTP: No
>   ACL          : No
>   DirectMedACL : No
>   T.38 support : Yes
>   T.38 EC mode : Redundancy
>   T.38 MaxDtgrm: -1
>   DirectMedia  : No
>   PromiscRedir : No
>   User=Phone   : No
>   Video Support: No
>   Text Support : No
>   Ign SDP ver  : No
>   Trust RPID   : No
>   Send RPID    : No
>   Subscriptions: Yes
>   Overlap dial : No
>   DTMFmode     : rfc2833
>   Timer T1     : 500
>   Timer B      : 32000
>   ToHost       : 10.0.0.14
>   Addr->IP     : 10.0.0.14:5060
>   Defaddr->IP  : (null)
>   Prim.Transp. : UDP
>   Allowed.Trsp : UDP
>   Reg. exten   : 
>   Def. Username: 
>   SIP Options  : (none)
>   Codecs       : (g729)
>   Codec Order  : (g729:20)
>   Auto-Framing :  No 
>   Status       : OK (1 ms)
>   Useragent    : 
>   Reg. Contact : 
>   Qualify Freq : 60000 ms
>   Keepalive    : 0 ms
>   Variables    :
>                  __noivr = yes
>   Sess-Timers  : Accept
>   Sess-Refresh : uas
>   Sess-Expires : 1800 secs
>   Min-Sess     : 90 secs
>   RTP Engine   : asterisk
>   Parkinglot   : 
>   Use Reason   : No
>   Encryption   : No
> {noformat}
> And it would have looked exactly the same just before reload.  The section in sip.conf:
> {noformat}
> [bar]
> type=friend
> host=10.0.0.14
> qualify=yes
> disallow=all
> allow=g729
> context=uls-makecall
> directmedia=no
> dtmfmode=rfc2833
> accountcode=IS
> jbforce=no
> setvar=__noivr=yes
> transport=udp
> {noformat}
