[asterisk-dev] Viva Chan_Sip, may it rest in peace

Corey Farrell git at cfware.com
Tue Oct 11 14:11:47 CDT 2016


On Mon, Oct 10, 2016 at 10:39 AM, Matthew Jordan <mjordan at digium.com> wrote:
>
> On Fri, Oct 7, 2016 at 10:31 AM, Corey Farrell <git at cfware.com> wrote:
> > Many people don't like chan_sip; most people hate working with the code.
> > The rush to throw out chan_sip when PJSIP isn't ready to be the only SIP
> > stack annoys me a bit.  Nobody is forcing anyone to use or contribute to
> > chan_sip.  Digium changed chan_sip from core to extended support so they can
> > significantly reduce involvement.  At some point chan_sip will fizzle out
> > but that hasn't happened yet.
>
> I don't think we're rushing - if anything, most people at DevCon (and
> I'd also say in this conversation) are being pretty realistic about
> how long it will take before chan_sip could be removed from the source
> tree. I think all we're seeing right now is a conversation about how
> we might eventually get there.

I wasn't at DevCon this year; sorry, I latched onto the harshest
statements.  I should have ignored those comments, knowing that Digium
will not just pull the rug out.

> Generally, when a module is first marked as being 'deprecated', we
> usually let it go through at least one additional release before it
> would ever be removed. Even then, our preference is to just leave the
> module in the source tree (usually disabled), in case someone still
> wants to use it. About the only time we remove a module is when it
> either no longer compiles, or when it is actively causing harm.
> Assuming we all decided to mark chan_sip as deprecated in Asterisk 15,
> then the soonest it would be removed from the source tree is Asterisk 17 -
> so I think we've got some time to hash this out.
>
> As far as extended support is concerned - yes, Digium is not
> interested in maintaining chan_sip any longer. Our commercial products
> and interests are focused on the PJSIP stack. As such, once Asterisk
> 11 leaves bug fix support, I would expect our fixing of chan_sip
> related issues to drop dramatically - even more than it already
> has. Patches submitted for chan_sip through Gerrit will still be
> reviewed and receive attention, as will security patches.

Ack.  As long as patches are accepted and chan_sip is kept in the
source tree while maintainers exist, this is good enough for me.

> > Last time I checked about 200 of the PJSIP testsuite tests produced AO2
> > leaks, in many cases hundreds of objects were leaked [1].  Some of these
> > leaks may be caused by bugs in tests and I realize that many use cases must
> > work without major leaking, but some use cases could cause failures.  I've
> > submitted some fixes but I find PJSIP leak tracing more difficult than other
> > parts of Asterisk.
>
> I'll admit that some of the tracing is a bit more challenging, both
> with sorcery as well as with PJSIP memory pools.

I have made no attempt to check PJSIP memory pools; so far for PJSIP
I've only ever looked at AO2 leaks.  There is no point doing
MALLOC_DEBUG/valgrind checking until the AO2 leaks are addressed.

> My impression right now is that any objects that are not released on
> shutdown are configuration objects or other similar objects that
> are allocated a single time during some PJSIP module initialization. While
> those make AO2 debugging challenging (and it would be great to clean
> them up), they don't indicate any harmful memory leak.

I'll need to see what the current results show (the last run was 3 months
ago).  These startup leaks may not be directly harmful in production,
but now or in the future a harmful leak will exist, and it will be
hidden in the haystack.  Automated testing can only pass or fail; since
no automated process can determine which leak is a startup leak and
which is harmful, this blocks leak testing.

> Of course, I do understand if that means you don't want to use those modules.

This has no immediate impact on my plans; I am not considering PJSIP
before 15 (possibly even 17).  I'm raising this concern in the hope
that we can try to improve the situation before chan_sip becomes
non-viable.  This has an impact on testing ARI as well.  I assume at
some point existing tests outside tests/channels will be converted to
use chan_pjsip instead of chan_sip (pbx apps/funcs, AMI, etc).  Long
term this could become a leak testing issue system-wide.

> > The current policy of allowing new features into released LTS branches is a
> > concern for me with PJSIP.  If I were to start using PJSIP I would have to
> > worry about each 13.x.0 release having a new PJSIP feature possibly cause a
> > bug.  A lot of bad things can be said about chan_sip, but new features are
> > extremely unlikely in 13.12.0.  It would be nice if a core set of PJSIP and
> > other modules could be declared LTS frozen.  During LTS releases these
> > modules would be strictly bug fix only.  I suspect this is not yet wanted or
> > possible for PJSIP modules, but hope it can be re-evaluated before new LTS
> > branches.  My hope is that eventually a basic PJSIP PBX could be run using
> > only frozen modules.  Users with specific needs or higher risk tolerance
> > could run some / all of the unfrozen modules to get more advanced or less
> > mature features.  Eventually the list of frozen modules could grow as each
> > module becomes feature complete and is proven stable.
>
> I don't think I want to complicate the new feature development process.
>
> While I think we've had a few hiccups, by and large, I haven't heard
> of a lot of complaints with the new features/improvements that are
> being released mid-stream. The only ones that have bitten me are where
> we combined modules or introduced a new needed module (res_pjproject),
> and that's because I explicitly load modules that I use. Are there
> specific issues you're thinking of?

I can't name specific issues because I've never used PJSIP outside the
testsuite.  My view is that 'a few hiccups' in LTS is a few too many.
The motivation behind 99% of my contributions to Asterisk was to
eliminate risks, so I'm not being inconsistent.  The current policy is
great as it applies to standard releases; my complaint is specifically
about LTS.  The Wikipedia article about LTS explains my view better
than I can [2].

How does the policy for new features apply to 13 now that 14 is
released?  Moving 13 to the Wikipedia definition of LTS now that
it is no longer 'current' would be a good compromise.  This way the
most current release branch of Asterisk can always provide rapid
release of new features, but we always have a feature-frozen LTS
version.  node.js does something like this: v6 was released in April
but isn't scheduled to become LTS until next week when v7 is released.

[2] https://en.wikipedia.org/wiki/Long-term_support


