[asterisk-bugs] [Asterisk 0011189]: chan_zap causing reset on E1 and eventually crashed asterisk

noreply at bugs.digium.com noreply at bugs.digium.com
Sun Mar 23 14:23:55 CDT 2008


A NOTE has been added to this issue. 
====================================================================== 
http://bugs.digium.com/view.php?id=11189 
====================================================================== 
Reported By:                freon1
Assigned To:                
====================================================================== 
Project:                    Asterisk
Issue ID:                   11189
Category:                   Channels/chan_zap
Reproducibility:            always
Severity:                   crash
Priority:                   normal
Status:                     feedback
Asterisk Version:           1.4.13  
SVN Branch (only for SVN checkouts, not tarball releases): N/A  
SVN Revision (number only!):  
Disclaimer on File?:        N/A 
Request Review:              
====================================================================== 
Date Submitted:             11-07-2007 21:03 CST
Last Modified:              03-23-2008 14:23 CDT
====================================================================== 
Summary:                    chan_zap causing reset on E1 and eventually crashed
asterisk
Description: 
I have a TE410 and a TC400B installed under Fedora 7 (kernel 2.6.23.1-10)
with asterisk-1.4.13, zaptel-1.4.6 and libpri-1.4.2.  Two E1s are connected
to the TE410 card, and all my calls go from IP to PSTN.  Once I have more
than 40-45 calls up (below 40 I haven't noticed any problems), after a
while I start getting errors similar to:
[Nov  6 17:32:18] ERROR[32610] chan_zap.c: !! Got reject for frame 30,
retransmitting frame 30 now, updating n_r!
The number of errors increases and all of a sudden one of the E1s resets
and all the calls on that E1 drop.  The problem repeats, and at some point
the other E1 resets; which E1 resets is pretty random.  Then, after several
of these errors, asterisk dumps core.  I have tested the hardware with the
help of Digium support and have eliminated the hardware as an issue.  I
also tested the E1s and they are clean.  To the best of my ability I have
also ruled out a conflict with any unrelated modules.  I have the full
debug log (crash at 18:29:37) and the core dump, both as tar.gz files.  I
am available to do further testing with live calls if more data is needed
or if there is a patch to try.
Thanks
====================================================================== 

---------------------------------------------------------------------- 
 freon1 - 03-23-08 14:23  
---------------------------------------------------------------------- 
Matt, before this report I started with tech support; they tested the
hardware and said it was a software issue.  The second thing is that when
I run Asterisk 1.2 I don't get any of the chan_zap errors at all, so I
don't know whether it could be hardware related.  I changed the t309 value
to 10000, but then it seemed to be holding up sockets, and at some point it
wouldn't take any more calls and showed all the calls as hung, so I set it
back to the default.

With version 1.4.19.rc3 I haven't seen any crashes (i.e. core dumps), but I
see "quite a bit" of the chan_zap errors (which I don't see under
1.2.26.2), and eventually I get a kernel error in the syslog (I think this
is due to the TC400B card/firmware) and no voice is heard.  Then all
channels hang and no calls go through; restarting asterisk doesn't fix the
problem, but rebooting the server does.  The core dump crashes may have
been fixed by the two memory leak fixes, or the whole thing may simply not
last long enough to reach a crash.

I am at the point of almost giving up on Digium hardware in combination
with Asterisk 1.4; at some point I will test with 1.6 when it is released.
Someone told me Sangoma hardware performs better and that they don't have
this problem.  I have no idea where I stand with this whole thing; I am
just happy I found a version of 1.2 that is working fairly stably for now,
apart from a few resets.

If you really want to figure out what is going on, I can set it up for you
to log in and debug while real calls are going through, as long as we can
limit the time, since the calls are live.  Thanks

Some sample chan_zap errors:    

q931.c:3751 q931_dl_indication: link is DOWN
q931.c:3757 q931_dl_indication: activate T309 for call 32802 on channel
21
q931.c:3757 q931_dl_indication: activate T309 for call 32817 on channel
20
q931.c:3757 q931_dl_indication: activate T309 for call 32830 on channel
22
q931.c:3757 q931_dl_indication: activate T309 for call 32833 on channel
24
q931.c:3757 q931_dl_indication: activate T309 for call 32844 on channel
26
q931.c:3757 q931_dl_indication: activate T309 for call 32852 on channel
29
q931.c:3757 q931_dl_indication: activate T309 for call 32855 on channel
30
  == Primary D-Channel on span 4 down
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got
I-frame while link state 2
q931.c:3772 q931_dl_indication: link is UP
q931.c:3776 q931_dl_indication: cancel T309 for call 32802 on channel 21
q931.c:3776 q931_dl_indication: cancel T309 for call 32817 on channel 20
q931.c:3776 q931_dl_indication: cancel T309 for call 32830 on channel 22
q931.c:3776 q931_dl_indication: cancel T309 for call 32833 on channel 24
q931.c:3776 q931_dl_indication: cancel T309 for call 32844 on channel 26
q931.c:3776 q931_dl_indication: cancel T309 for call 32852 on channel 29
q931.c:3776 q931_dl_indication: cancel T309 for call 32855 on channel 30
  == Primary D-Channel on span 4 up
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 0 now, updating n_r!
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 1 now, updating n_r!
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 2 now, updating n_r!
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 3 now, updating n_r!
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 4 now, updating n_r!
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 5 now, updating n_r!
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 6 now, updating n_r!
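
For anyone comparing runs (for example 1.2.26.2 against 1.4.19.rc3), the reject messages above can be tallied per timestamp. This is only a rough sketch: `count_rejects` and `SAMPLE` are names made up for illustration, and the pattern assumes the two message layouts quoted in this report, with mail line-wrapping undone first.

```python
import re
from collections import Counter

# Matches both layouts seen in this report:
#   ERROR[32610] chan_zap.c: !! Got reject ...
#   ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject ...
REJECT_RE = re.compile(
    r"\[(\w{3} +\d+ [\d:]{8})\] ERROR\[\d+\]:? chan_zap\.c\S*(?: \w+:)?"
    r" !! Got reject for frame (\d+), retransmitting frame (\d+)"
)

def count_rejects(log_text):
    """Return a Counter mapping log timestamp -> number of reject messages."""
    # Collapse all whitespace so messages wrapped across lines match again.
    flat = " ".join(log_text.split())
    return Counter(m.group(1) for m in REJECT_RE.finditer(flat))

# Lines copied from this report, including the mail wrapping.
SAMPLE = """\
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 0 now, updating n_r!
[Mar 22 16:12:39] ERROR[2782]: chan_zap.c:8249 zt_pri_error: !! Got reject
for frame 0, retransmitting frame 1 now, updating n_r!
[Nov  6 17:32:18] ERROR[32610] chan_zap.c: !! Got reject for frame 30,
retransmitting frame 30 now, updating n_r!
"""

if __name__ == "__main__":
    for timestamp, n in count_rejects(SAMPLE).items():
        print(timestamp, n)
```

Because whitespace is collapsed before matching, a timestamp such as `Nov  6` comes back with a single space.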

Kernel errors before the whole thing goes really bad:

Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: Oops: 0002 [#1]
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: SMP
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: CPU:    1
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: EIP:    0060:[<f894c448>]    Not tainted VLI
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: EFLAGS: 00010286   (2.6.23.15-80.fc7 #1)
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: EIP is at zt_tc_ioctl+0x247/0x313 [zttranscode]
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: eax: 00000000   ebx: f78b2064   ecx: c0044a5d   edx:
00000002
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: esi: f64f40c0   edi: 00000002   ebp: f79c6920   esp:
f668398c
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: Process asterisk (pid: 4271, ti=f6683000
task=f65eac20 task.ti=f6683000)
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: Stack: 00000000 f6f47900 f64f40c0 f78b2064 00000001
00000019 00700000 00701000
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:        f66839f8 00000025 08100073 f6683a11 00000000
00000000 f66839c4 c201a1d0
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:        f65ea63c c201a180 00000004 c2044a11 f7aec144
b77beae8 f6ea8cc0 f8a34d88
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: Call Trace:
    -- Hungup 'Zap/83-1'
  == Spawn extension (macro-expdialh6, s, 5) exited non-zero on
'IAX2/centgw112-52' in macro 'expdialh6'
  == Spawn extension (macro-expdialh6, s, 5) exited non-zero on
'IAX2/centgw112-52'
    -- Hungup 'IAX2/centgw112-52'
ele_hond_245*CLI>
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<f8a34d88>] zt_chan_ioctl+0x652/0x673 [zaptel]
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<f8a35f13>] zt_ioctl+0x116a/0x13ad [zaptel]
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c044173e>] getnstimeofday+0x30/0xbe
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c041cb3e>] lapic_next_event+0xc/0x10
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c0443646>] clockevents_program_event+0xb5/0xbc
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c0425d43>] enqueue_entity+0x2dd/0x307
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c0425797>] __check_preempt_curr_fair+0x55/0x86
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c0425d43>] enqueue_entity+0x2dd/0x307
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c0425a3f>] dequeue_entity+0xa4/0xcb
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c0425da8>] task_tick_fair+0x3b/0x60
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c044173e>] getnstimeofday+0x30/0xbe
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c041cb3e>] lapic_next_event+0xc/0x10
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c0443646>] clockevents_program_event+0xb5/0xbc
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c044437a>] tick_program_event+0x33/0x52
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0440426>] hrtimer_interrupt+0x192/0x1bc
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c04445de>] tick_sched_timer+0x0/0xbb
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0431ccc>] irq_exit+0x53/0x6b
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c041d05a>] smp_apic_timer_interrupt+0x71/0x7d
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c061d3bc>] _read_lock_bh+0x8/0x17
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0405c2c>] apic_timer_interrupt+0x28/0x30
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c061007b>] xfrm_send_policy_notify+0x3d7/0x4fd
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c043d55c>] remove_wait_queue+0x16/0x22
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c048bafa>] free_poll_entry+0xe/0x16
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c048bb1a>] poll_freewait+0x18/0x4c
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c048be52>] do_sys_poll+0x304/0x329
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c048c77b>] __pollwait+0x0/0xac
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0427c5b>] default_wake_function+0x0/0xc
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0427c5b>] default_wake_function+0x0/0xc
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c044173e>] getnstimeofday+0x30/0xbe
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c041cb3e>] lapic_next_event+0xc/0x10
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0443646>] clockevents_program_event+0xb5/0xbc
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c044437a>] tick_program_event+0x33/0x52
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0440426>] hrtimer_interrupt+0x192/0x1bc
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c04445de>] tick_sched_timer+0x0/0xbb
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0431ccc>] irq_exit+0x53/0x6b
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c041d05a>] smp_apic_timer_interrupt+0x71/0x7d
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0425095>] update_stats_wait_end+0xd3/0xfe
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c061d679>] __reacquire_kernel_lock+0x2f/0x4b
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c04656d7>] __rmqueue+0x5e/0xac
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c041c55f>] apic_wait_icr_idle+0xe/0x15
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0425d43>] enqueue_entity+0x2dd/0x307
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c041ac3d>] native_smp_send_reschedule+0x5f/0x64
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0425797>] __check_preempt_curr_fair+0x55/0x86
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0425733>] resched_task+0x55/0x58
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c042ab43>] check_preempt_curr_fair+0x6b/0x71
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c0427c51>] try_to_wake_up+0x2ef/0x2f9
Message from syslogd@ at Sat Mar 22 16:10:48 2008 ...
ele_hond_245 kernel:  [<c04f5928>] copy_to_user+0x34/0x48
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<f8a2ef7c>] zt_chan_read+0x1e0/0x209 [zaptel]
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c04256b4>] update_curr+0x13d/0x167
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c046a9d6>] vma_prio_tree_insert+0x17/0x2a
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c0471173>] vma_link+0xa5/0xc3
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c0425095>] update_stats_wait_end+0xd3/0xfe
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c04041be>] __switch_to+0xcb/0x149
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c048b38d>] do_ioctl+0x4d/0x63
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c048b5da>] vfs_ioctl+0x237/0x249
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ...
ele_hond_245 kernel:  [<c048b638>] sys_ioctl+0x4c/0x64
Message from syslogd@ at Sat Mar 22 16:10:49 2008 ... 
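
The trace above is hard to read because every kernel line is preceded by a syslogd broadcast line. As a rough sketch (the helper name `module_frames` is hypothetical and not part of any Asterisk or zaptel tool), the frames that belong to loadable modules such as zaptel and zttranscode can be pulled out like this:

```python
import re

# "symbol+0xoffset/0xsize [module]", e.g. "zt_chan_ioctl+0x652/0x673 [zaptel]".
# Frames inside the core kernel carry no "[module]" suffix and are skipped.
FRAME_RE = re.compile(r"(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")

def module_frames(oops_text):
    """Return (symbol, module) pairs for frames that name a kernel module."""
    frames = []
    for line in oops_text.splitlines():
        # Drop the syslogd broadcast banners interleaved with the oops.
        if line.startswith("Message from syslogd@"):
            continue
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group(1), match.group(2)))
    return frames

# A few lines copied from the trace in this report.
SAMPLE = """\
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel: EIP is at zt_tc_ioctl+0x247/0x313 [zttranscode]
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<f8a34d88>] zt_chan_ioctl+0x652/0x673 [zaptel]
Message from syslogd@ at Sat Mar 22 16:10:47 2008 ...
ele_hond_245 kernel:  [<c044173e>] getnstimeofday+0x30/0xbe
"""

if __name__ == "__main__":
    for symbol, module in module_frames(SAMPLE):
        print(module, symbol)
```

On the sample lines this keeps the zttranscode and zaptel frames and drops the core-kernel one, which matches the suspicion that the oops is in the transcoder path.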

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
03-23-08 14:23  freon1         Note Added: 0084428                          
======================================================================



