[asterisk-dev] [Code Review] 2880: CDRs: prevent backend modules from unregistering while active CDRs are in flight

Matt Jordan reviewboard at asterisk.org
Wed Oct 2 11:31:13 CDT 2013



> On Sept. 27, 2013, 3:25 p.m., Corey Farrell wrote:
> > /branches/12/main/cdr.c, lines 2590-2603
> > <https://reviewboard.asterisk.org/r/2880/diff/1/?file=46334#file46334line2590>
> >
> >     Any call to ast_cdr_unregister done from a config reload needs to succeed - cdr_manager, cdr_odbc, cdr_mysql do this.  For disabled modules ast_cdr_unregister / module_unload should not be blocked.
> >     
> >     Returning an int instead of void doesn't cause build warnings or errors for existing unmodified cdr modules.  Any such module would crash on cli module unload with active_count != 0.
> >     
> >     A parameter could be added to force ast_cdr_unregister.  The other idea I have is expose a function to retreive active_count, leave it up to each module unload to decide if it should continue.
> 
> Matt Jordan wrote:
>     {quote}
>     Any call to ast_cdr_unregister done from a config reload needs to succeed - cdr_manager, cdr_odbc, cdr_mysql do this.  For disabled modules ast_cdr_unregister / module_unload should not be blocked.
>     {quote}
>     
>     In general, if they are reloading they shouldn't be unregistering. Some are trying to do this because the CDR engine is disabled - but that logic should be handled exclusively in the CDR engine. Others are attempting to do it as some form of quasi-cleanup on a failed reload; however, for the most part this cleanup is either not going to occur or would be unsuccessful in the first place.
>     
>     I'm not sure what you mean by "disabled modules". Modules that decline load due to failed configuration shouldn't be registered in the first place, so it should be a non-issue.
>     
>     In general, I disagree with letting an unregister forcibly remove itself: if you have billing data in flight, you shouldn't be unregistering a backend. The whole point of all of the ridiculous lengths Asterisk goes to in trying to get the billing data to be written is specifically to avoid losing it.
>     
>     {quote}
>     Returning an int instead of void doesn't cause build warnings or errors for existing unmodified cdr modules.  Any such module would crash on cli module unload with active_count != 0.
>     {quote}
>     
>     Asterisk 12 isn't released yet - it's in Alpha - if someone attempts to take a CDR module from 12.0.0-alpha1 and run it with Asterisk 12.0.0-beta1, that's a bad assumption. Don't do that.
>     
>     {quote}
>     A parameter could be added to force ast_cdr_unregister.  The other idea I have is expose a function to retreive active_count, leave it up to each module unload to decide if it should continue.
>     {quote}
>     
>     I really don't care for that. Forcing an unregister would most likely make the most "sense" during Asterisk shutdown, which is the whole point of this patch. You shouldn't unregister a CDR backend when you have data that still needs to be written: period. Letting each module make the decision means we'll have different behavior in each CDR module, which again, is a "bad thing".
>     
>
> 
> Corey Farrell wrote:
>     In cdr_manager.conf, [general] enabled=no or cdr_odbc.conf with no [global] section. These both cause ast_cdr_unregister (cdr engine is enabled but the module is loaded and disabled).
>     
>     Other modules do not ast_cdr_unregister because of config, but they can be configured to record nothing:
>     cdr_custom.conf - no [mappings]
>     cdr_adaptive_odbc.conf - blank or comment only
>     cdr_syslog.conf - no sections with template parameter
>     
>     These 5 cases are from the sample configs, the modules do not fail or decline load.

Blech.

On looking at cdr_mysql some more, I'm reevaluating my adamant position against a 'temporary' suspension - it does a lot of 'stuff' during a reload that, if a CDR came in to be processed on another thread, would result in some poor behavior. cdr_manager would be easier to handle - the enabling/disabling of CDRs could be handled when a CDR is posted to the backend - however, since cdr_mysql needs some temporary suspension behavior as well, it's probably best to just add that.

I'll go ahead and let a CDR backend to suspend itself, as opposed to be removing from the list. That's fairly easy to add, and wouldn't affect the purpose of this patch - and should address the finding.


- Matt


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2880/#review9823
-----------------------------------------------------------


On Sept. 25, 2013, 4:07 p.m., Matt Jordan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/2880/
> -----------------------------------------------------------
> 
> (Updated Sept. 25, 2013, 4:07 p.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> During shutdown, Asterisk attempts to flush out any active CDRs by calling ast_cdr_engine_term. This is to try to get as much billing data recorded as possible before Asterisk shuts down/restarts and that data is lost. Unfortunately, module unloading occurs prior to calling ast_cdr_engine_term, and the CDR backend modules have already unregistered themselves. Thus, while the records are successfully flushed from the system, there are no registered backends left to handle the data.
> 
> This is technically a problem in prior versions of Asterisk as well; however, it is far less likely to occur in Asterisk 1.8/11 due to CDRs being build on an asynchronous message bus in Asterisk 12.
> 
> This patch modified ast_cdr_unregister such that if active_cdrs_by_channel has an entry, the unregistration fails. This would have the side effect of preventing CDR module unloading via the CLI on a busy system - but if you're doing that, you're already losing billing data, so that seems like a very bad idea anyway.
> 
> 
> Diffs
> -----
> 
>   /branches/12/main/cdr.c 399775 
>   /branches/12/include/asterisk/cdr.h 399775 
>   /branches/12/cdr/cdr_tds.c 399775 
>   /branches/12/cdr/cdr_syslog.c 399775 
>   /branches/12/cdr/cdr_sqlite3_custom.c 399775 
>   /branches/12/cdr/cdr_sqlite.c 399775 
>   /branches/12/cdr/cdr_pgsql.c 399775 
>   /branches/12/cdr/cdr_radius.c 399775 
>   /branches/12/cdr/cdr_odbc.c 399775 
>   /branches/12/cdr/cdr_manager.c 399775 
>   /branches/12/cdr/cdr_custom.c 399775 
>   /branches/12/cdr/cdr_csv.c 399775 
>   /branches/12/cdr/cdr_adaptive_odbc.c 399775 
> 
> Diff: https://reviewboard.asterisk.org/r/2880/diff/
> 
> 
> Testing
> -------
> 
> The various dial tests that have a tendency to end very quickly while CDR data is in flight now pass consistently.
> 
> 
> Thanks,
> 
> Matt Jordan
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20131002/09114670/attachment-0001.html>


More information about the asterisk-dev mailing list