[asterisk-bugs] [Asterisk 0010347]: Asterisk Crashes in cdr_csv.c during csv_log while trying to close the log file

noreply at bugs.digium.com
Tue Jul 31 22:57:29 CDT 2007


A NOTE has been added to this issue. 
====================================================================== 
http://bugs.digium.com/view.php?id=10347 
====================================================================== 
Reported By:                explidous
Assigned To:                
====================================================================== 
Project:                    Asterisk
Issue ID:                   10347
Category:                   CDR/cdr_csv
Reproducibility:            always
Severity:                   crash
Priority:                   normal
Status:                     new
Asterisk Version:            SVN 
SVN Branch (only for SVN checkouts, not tarball releases):  trunk 
SVN Revision (number only!): 77826 
Disclaimer on File?:        N/A 
Request Review:              
====================================================================== 
Date Submitted:             07-31-2007 16:13 CDT
Last Modified:              07-31-2007 22:57 CDT
====================================================================== 
Summary:                    Asterisk Crashes in cdr_csv.c during csv_log while
trying to close the log file
Description: 
Under moderately high load Asterisk crashes while trying to close the
log file at line 296 in cdr_csv.c. It seems that two processes call this
function at the same time and try to close the file descriptor twice. This
is most likely because a lock was removed somewhere in Asterisk.

To fix this I made the module close the file only on unload and reload,
open it only on load and reload, and take a lock around the code that
actually writes to the file. Previously Asterisk opened the file, wrote to
it, flushed the buffer, then closed the file. This was most likely because
fflush() had problems in the past with not properly flushing the buffer.
This has been fixed since about 2002. Now Asterisk locks, writes to the
file, unlocks, then flushes the buffer. This also makes the function
thread safe.
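
A minimal sketch of the open-once, lock-around-write pattern described
above. This is not the code from cdr_csv.c or from the patch: every name
(sketch_log, sketch_lock, sketch_open, sketch_close, sketch_write) is
hypothetical, and plain pthread mutexes stand in for whatever locking
primitive the module actually uses. Unlike the description, the sketch
flushes while still holding the lock, on the assumption that a concurrent
unload could otherwise close the stream between the unlock and the flush.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical names; a sketch of the pattern, not the real module. */
static FILE *sketch_log;  /* opened once and kept open between writes */
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called on load and reload. Returns 0 on success, -1 on failure. */
int sketch_open(const char *path)
{
    int res = 0;

    pthread_mutex_lock(&sketch_lock);
    if (sketch_log)
        fclose(sketch_log);          /* reload: drop the old handle */
    sketch_log = fopen(path, "a");
    if (!sketch_log)
        res = -1;
    pthread_mutex_unlock(&sketch_lock);
    return res;
}

/* Called on unload and reload. Safe to call when nothing is open. */
void sketch_close(void)
{
    pthread_mutex_lock(&sketch_lock);
    if (sketch_log) {
        fclose(sketch_log);
        sketch_log = NULL;           /* prevents a second fclose() */
    }
    pthread_mutex_unlock(&sketch_lock);
}

/* Called once per record. Returns 0 on success, -1 on failure. */
int sketch_write(const char *line)
{
    int res = -1;

    assert(line != NULL);
    pthread_mutex_lock(&sketch_lock);
    /* Write and flush under the lock so no other thread can close
     * the stream in between. */
    if (sketch_log && fputs(line, sketch_log) != EOF
        && fflush(sketch_log) == 0)
        res = 0;
    pthread_mutex_unlock(&sketch_lock);
    return res;
}
```

Because the handle is set to NULL under the same lock that guards the
write, two threads can no longer both reach fclose() on the same stream.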

Asterisk has not crashed at this point since (12 tries so far).
====================================================================== 

---------------------------------------------------------------------- 
 explidous - 07-31-07 22:57  
---------------------------------------------------------------------- 
No objection whatsoever on locking in (un)load_module. I'll upload a new
patch soon.
writefile looks like it does not need a lock as it stands. If it did, it
would be better served by a separate lock, since there is no reason to
limit parallelism between the two separate files. A lock in writefile is
only desirable if the OS cannot handle multiple threads opening the same
file and writing to it at the "same" time. In that case the simple but not
very performant solution would be to put that separate lock around the
whole fopen-to-fclose section. For optimum parallel processing we would
need either a lock per file, or files that are not opened as shared, with
better handling of a failed open than just a "return -1". BTW, I'll add a
LOG_ERROR for that failure now...
Opening and closing the file on each write should really not be necessary
anymore. There were issues where fflush did not reliably update files,
depending on which write operation was used, but these were resolved in
most OS versions many years ago (2002).
The problem with closing and reopening for each write is that it would
have to be done within the lock as well, which would make the time spent
holding the lock much longer and hurt general performance.
I would be happy to include a compile/config option, something along the
lines of CSVLOG_PARANOID_FILE_HANDLING, to switch to alternative code
that opens and closes the file for each line. If anyone still knows of
systems that do not handle fflush correctly, those could be set to use the
alternative code automatically.
I am willing to test whether we can produce any data loss on the Linux
platforms we currently use.
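
A rough sketch of how the per-file lock and the proposed compile option
could fit together. Only the macro name CSVLOG_PARANOID_FILE_HANDLING
comes from the note above; struct sketch_file and sketch_append are
hypothetical, and pthread mutexes are assumed in place of the module's
own locking primitive.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical per-file state: each log file carries its own lock,
 * so writers to different files never wait on each other. */
struct sketch_file {
    const char *path;
    FILE *fp;                 /* unused in the paranoid variant */
    pthread_mutex_t lock;
};

/* Append one line. Returns 0 on success, -1 on failure. */
int sketch_append(struct sketch_file *f, const char *line)
{
    int res = -1;

    assert(f != NULL && line != NULL);
    pthread_mutex_lock(&f->lock);
#ifdef CSVLOG_PARANOID_FILE_HANDLING
    {
        /* Open and close per line; the lock is held for the whole
         * fopen-to-fclose section, so it is held longer. */
        FILE *fp = fopen(f->path, "a");
        if (fp) {
            if (fputs(line, fp) != EOF)
                res = 0;
            if (fclose(fp) != 0)
                res = -1;
        }
    }
#else
    /* Default: keep the file open and rely on fflush(). */
    if (!f->fp)
        f->fp = fopen(f->path, "a");
    if (f->fp && fputs(line, f->fp) != EOF && fflush(f->fp) == 0)
        res = 0;
#endif
    pthread_mutex_unlock(&f->lock);
    return res;
}
```

Building with -DCSVLOG_PARANOID_FILE_HANDLING selects the per-line
open/close path. The failed-open case still returns -1 here; logging it
is left out because it depends on the surrounding module.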

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
07-31-07 22:57  explidous      Note Added: 0068182                          
======================================================================



