[asterisk-bugs] [Zaptel 0011661]: ztcfg default behavior + races in zaptel drivers + unsuspecting admins cause kernel crashes

noreply at bugs.digium.com
Fri Jan 11 19:46:50 CST 2008


A NOTE has been added to this issue. 
====================================================================== 
http://bugs.digium.com/view.php?id=11661 
====================================================================== 
Reported By:                sim
Assigned To:                
====================================================================== 
Project:                    Zaptel
Issue ID:                   11661
Category:                   Core-General
Reproducibility:            always
Severity:                   minor
Priority:                   normal
Status:                     new
Zaptel Version:             1.4.7.1 
SVN Branch (only for SVN checkouts, not tarball releases): N/A  
SVN Revision (number only!):  
Disclaimer on File?:        N/A 
Request Review:              
====================================================================== 
Date Submitted:             12-31-2007 18:29 CST
Last Modified:              01-11-2008 19:46 CST
====================================================================== 
Summary:                    ztcfg default behavior + races in zaptel drivers +
unsuspecting admins cause kernel crashes
Description: 
I have seen more than five different people accidentally run "ztcfg" on
production systems while poking at zaptel to find the source of an
issue.  One runs through "zttool", "ztdiag", "zttest", etc., and, not
knowing better, usually "ztcfg" as well, expecting it to behave like
"ifconfig".  In a perfect world this would merely reset the PRIs, but
instead it tends to cause a kernel Oops in zt_init_tone_state due to
races in zaptel when channels are active.

I bet this has been killing people's Asterisk servers around the globe
for many years now, and I think it is serious enough to warrant an
interface change.  I propose that we either change "ztcfg" to require an
argument before applying any changes, or have it check whether Asterisk
is running first and refuse to apply the configuration if so, unless an
override argument is specified.
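
For illustration, the second option could look roughly like the sketch
below.  It is written as a Python wrapper around ztcfg rather than as a
change to ztcfg itself.  The names asterisk_running and may_apply and the
"--force" flag are hypothetical, not existing ztcfg options.  It also
assumes a Linux /proc that exposes /proc/<pid>/comm, and that the daemon's
process name is "asterisk".

```python
import os
import subprocess
import sys


def asterisk_running(proc_root="/proc"):
    """Return True if a process named 'asterisk' is listed under proc_root.

    Assumption: each /proc/<pid>/comm file holds the process name.
    If proc_root cannot be read at all, this returns False, so the
    check fails open; a real patch might prefer to fail closed.
    """
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return False
    for entry in entries:
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == "asterisk":
                    return True
        except OSError:
            continue  # process exited while we were scanning
    return False


def may_apply(force, running):
    """Apply configuration only if Asterisk is down or the admin insists."""
    return force or not running


def main(argv):
    # "--force" is a hypothetical override flag handled by this wrapper
    # only; every other argument is passed through to ztcfg unchanged.
    force = "--force" in argv
    if not may_apply(force, asterisk_running()):
        sys.stderr.write(
            "Asterisk appears to be running; refusing to apply the "
            "configuration (use --force to override).\n"
        )
        return 1
    passthrough = [arg for arg in argv if arg != "--force"]
    return subprocess.call(["ztcfg"] + passthrough)
```

Calling main(sys.argv[1:]) from a small script would then refuse to touch
a live system unless "--force" is given, which is the behavior proposed
above.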

If agreed, I can easily write and submit a patch -- just let me know.
====================================================================== 

---------------------------------------------------------------------- 
 sim - 01-11-08 19:46  
---------------------------------------------------------------------- 
Hi tzafrir,

I have no way to simplify the steps or narrow the crash down to a
single, reproducible case.  I have only seen this in production with live
calls, and I do not have any test PRI hardware to reproduce it on.  In
summary, the crash has happened each time somebody has run "ztcfg"
accidentally, causing a PRI reset.  Hence my original request to alter
ztcfg, which I still think needs changing, because it will still kill
calls even with the races fixed.

The recent Zaptel crashes have happened both on a box with one active PRI
and on a box with three active PRIs, with probably about 60% channel
utilization on each of them.  When "ztcfg" is run, the kernel Oopses
almost immediately, with a backtrace similar to the one in
http://bugs.digium.com/view.php?id=10593 .  The only backtraces I was able
to capture seem to have a corrupted stack, because the call path looks
nonsensical.  As I have no way of reproducing this outside of production
at this time, I can't really get any more information.

However, it should be fairly easy to reproduce this by simply setting up
two boxes with a PRI crossover, starting some fake test calls, and running
"ztcfg" on one of the boxes.  You may have to declare it production
before the bug shows up, though. ;) 

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
01-11-08 19:46  sim            Note Added: 0076783                          
======================================================================



