[asterisk-users] Kernel Panic in wct4xxp during unload on Zaptel-1.4.4

James FitzGibbon james.fitzgibbon at gmail.com
Tue May 22 07:07:19 MST 2007


I attempted an upgrade of our production system from Asterisk/Zaptel 1.2 to
1.4 this weekend.  Initially everything looked like it was working properly,
but some time in the day following the upgrade, the system died with a kernel
panic.  Unfortunately, I wasn't able to catch the entire kernel dump on the
console.

I attempted to isolate the panic, and found that whenever 'service zaptel
stop' is run (specifically, when wct4xxp is unloaded) I get this panic
consistently:

<5> Not prepped yet! (repeated approx 250 times)
<5> Freed a Wildcard
<5> Not prepped yet! (repeated approx 550 times)
<5> Stopped TE4XXP, Turned off DMA
<5> Not prepped yet! (repeated approx 11000 times)
<5> Unable to handle kernel paging request at ffffff0000034010 RIP:
<5> <ffffffffa0163207>{:wct4xxp:t4_interrupt_gen2+63}
<5> PML4 4ea063 PGD 1388067 PMD 1389067 PTE 0
<5> Oops: 0000 [1] SMP
<5> CPU 1
<5> Modules linked in: zttranscode(U) wct4xxp(U) zaptel(U) crc_ccitt
netconsole md5 ipv6 dm_mirror dm_mod button battery ac joydev ehci_hcd
uhci_hcd hw_random bnx2 ext3 jbd
cciss sd_mod scsi_mod
<5> Pid: 11053, comm: hotplug Not tainted 2.6.9-42.0.8.ELsmp
<5> RIP: 0010:[<ffffffffa0163207>]
<ffffffffa0163207>{:wct4xxp:t4_interrupt_gen2+63}
<5> RSP: 0000:00000100013ebdb0  EFLAGS: 00010046
<5> RAX: ffffff0000034000 RBX: 0000010073478724 RCX: 0000000000000002
<5> RDX: 0000010073478680 RSI: 0000000000000002 RDI: 0000010073478724
<5> RBP: 0000010073478680 R08: 0000000000000008 R09: 0000000000000000
<5> R10: 0000000000000000 R11: 0000000000000002 R12: 00000000000000d1
<5> R13: 00000100013ebec8 R14: 00000100013ebec8 R15: 0000010075e94978
<5> FS:  0000002a955643e0(0000) GS:ffffffff804e5900(0000)
knlGS:0000000000000000
<5> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<5> CR2: ffffff0000034010 CR3: 00000000013d8000 CR4: 00000000000006e0
<5> Process hotplug (pid: 11053, threadinfo 000001007413c000, task
000001007c445030)
<5> Stack: 00000000000000d1 000001007413dc98 ffffffff80138552
0000003000000008
<5>        00000100013ebea8 00000100013ebde8 0000000000000001
00000000000000d1
<5>        0000000000000012 0000010073478680
<5> Call Trace:<IRQ> <ffffffff80138552>{printk+141}
<ffffffff80112f4a>{handle_IRQ_event+41}
<5>        <ffffffff801131c4>{do_IRQ+197}
<ffffffff80110833>{ret_from_intr+0}
<5>        <ffffffff8013c731>{__do_softirq+77}
<ffffffff8013c7e5>{do_softirq+49}
<5>        <ffffffff80110bf5>{apic_timer_interrupt+133}  <EOI>
<ffffffff8011c21a>{flush_tlb_page+44}
<5>        <ffffffff80169106>{do_wp_page+1127}
<ffffffff80123ed3>{do_page_fault+575}
<5>        <ffffffff80169ff2>{handle_mm_fault+1228}
<ffffffff80123e9a>{do_page_fault+518}
<5>        <ffffffff8011026a>{system_call+126}
<ffffffff80132bc6>{schedule_tail+202}
<5>        <ffffffff80110d91>{error_exit+0}
<5>
<5> Code: 8b 40 10 89 44 24 58 e8 3d 80 1a e0 31 c0 f6 44 24 58 07 0f
<5> RIP <ffffffffa0163207>{:wct4xxp:t4_interrupt_gen2+63} RSP
<00000100013ebdb0>
<5> CR2: ffffff0000034010
<5>  <0>Kernel panic - not syncing: Oops
<5>  Badness in panic at kernel/panic.c:118
<5>
<5> Call Trace:<IRQ> <ffffffff80137a8a>{panic+527}
<ffffffff80110bf5>{apic_timer_interrupt+133}
<5>        <ffffffff80111aec>{oops_end+38} <ffffffff80111b07>{oops_end+65}
<5>        <ffffffff80124148>{do_page_fault+1204}
<ffffffffa0078f51>{:bnx2:bnx2_start_xmit+470}
<5>        <ffffffff802bb4cd>{netpoll_send_skb+257}
<ffffffff80110d91>{error_exit+0}
<5>        <ffffffffa0163207>{:wct4xxp:t4_interrupt_gen2+63}
<ffffffff80138552>{printk+141}
<5>        <ffffffff80112f4a>{handle_IRQ_event+41}
<ffffffff801131c4>{do_IRQ+197}
<5>        <ffffffff80110833>{ret_from_intr+0}
<ffffffff8013c731>{__do_softirq+77}
<5>        <ffffffff8013c7e5>{do_softirq+49}
<ffffffff80110bf5>{apic_timer_interrupt+133}
<5>         <EOI> <ffffffff8011c21a>{flush_tlb_page+44}
<ffffffff80169106>{do_wp_page+1127}
<5>        <ffffffff80123ed3>{do_page_fault+575}
<ffffffff80169ff2>{handle_mm_fault+1228}
<5>        <ffffffff80123e9a>{do_page_fault+518}
<ffffffff8011026a>{system_call+126}
<5>        <ffffffff80132bc6>{schedule_tail+202}
<ffffffff80110d91>{error_exit+0}
<5>
<5> Badness in i8042_panic_blink at drivers/input/serio/i8042.c:987
<5>
<5> Call Trace:<IRQ> <ffffffff8024219b>{i8042_panic_blink+238}
<ffffffff80137a38>{panic+445}
<5>        <ffffffff80110bf5>{apic_timer_interrupt+133}
<ffffffff80111aec>{oops_end+38}
<5>        <ffffffff80111b07>{oops_end+65}
<ffffffff80124148>{do_page_fault+1204}
<5>        <ffffffffa0078f51>{:bnx2:bnx2_start_xmit+470}
<ffffffff802bb4cd>{netpoll_send_skb+257}
<5>        <ffffffff80110d91>{error_exit+0}
<ffffffffa0163207>{:wct4xxp:t4_interrupt_gen2+63}
<5>        <ffffffff80138552>{printk+141}
<ffffffff80112f4a>{handle_IRQ_event+41}
<5>        <ffffffff801131c4>{do_IRQ+197}
<ffffffff80110833>{ret_from_intr+0}
<5>        <ffffffff8013c731>{__do_softirq+77}
<ffffffff8013c7e5>{do_softirq+49}
<5>        <ffffffff80110bf5>{apic_timer_interrupt+133}  <EOI>
<ffffffff8011c21a>{flush_tlb_page+44}
<5>        <ffffffff80169106>{do_wp_page+1127}
<ffffffff80123ed3>{do_page_fault+575}
<5>        <ffffffff80169ff2>{handle_mm_fault+1228}
<ffffffff80123e9a>{do_page_fault+518}
<5>        <ffffffff8011026a>{system_call+126}
<ffffffff80132bc6>{schedule_tail+202}
<5>        <ffffffff80110d91>{error_exit+0}
<5> Badness in i8042_panic_blink at drivers/input/serio/i8042.c:990
<5>
<5> Call Trace:<IRQ> <ffffffff8024222d>{i8042_panic_blink+384}
<ffffffff80137a38>{panic+445}
<5>        <ffffffff80110bf5>{apic_timer_interrupt+133}
<ffffffff80111aec>{oops_end+38}
<5>        <ffffffff80111b07>{oops_end+65}
<ffffffff80124148>{do_page_fault+1204}
<5>        <ffffffffa0078f51>{:bnx2:bnx2_start_xmit+470}
<ffffffff802bb4cd>{netpoll_send_skb+257}
<5>        <ffffffff80110d91>{error_exit+0}
<ffffffffa0163207>{:wct4xxp:t4_interrupt_gen2+63}
<5>        <ffffffff80138552>{printk+141}
<ffffffff80112f4a>{handle_IRQ_event+41}
<5>        <ffffffff801131c4>{do_IRQ+197}
<ffffffff80110833>{ret_from_intr+0}
<5>        <ffffffff8013c731>{__do_softirq+77}
<ffffffff8013c7e5>{do_softirq+49}
<5>        <ffffffff80110bf5>{apic_timer_interrupt+133}  <EOI>
<ffffffff8011c21a>{flush_tlb_page+44}
<5>        <ffffffff80169106>{do_wp_page+1127}
<ffffffff80123ed3>{do_page_fault+575}
<5>        <ffffffff80169ff2>{handle_mm_fault+1228}
<ffffffff80123e9a>{do_page_fault+518}
<5>        <ffffffff8011026a>{system_call+126}
<ffffffff80132bc6>{schedule_tail+202}
<5>        <ffffffff80110d91>{error_exit+0}
<5> Badness in i8042_panic_blink at drivers/input/serio/i8042.c:992
<5>
<5> Call Trace:<IRQ> <ffffffff80242292>{i8042_panic_blink+485}
<ffffffff80137a38>{panic+445}
<5>        <ffffffff80110bf5>{apic_timer_interrupt+133}
<ffffffff80111aec>{oops_end+38}
<5>        <ffffffff80111b07>{oops_end+65}
<ffffffff80124148>{do_page_fault+1204}
<5>        <ffffffffa0078f51>{:bnx2:bnx2_start_xmit+470}
<ffffffff802bb4cd>{netpoll_send_skb+257}
<5>        <ffffffff80110d91>{error_exit+0}
<ffffffffa0163207>{:wct4xxp:t4_interrupt_gen2+63}
<5>        <ffffffff80138552>{printk+141}
<ffffffff80112f4a>{handle_IRQ_event+41}
<5>        <ffffffff801131c4>{do_IRQ+197}
<ffffffff80110833>{ret_from_intr+0}
<5>        <ffffffff8013c731>{__do_softirq+77}
<ffffffff8013c7e5>{do_softirq+49}
<5>        <ffffffff80110bf5>{apic_timer_interrupt+133}  <EOI>
<ffffffff8011c21a>{flush_tlb_page+44}
<5>        <ffffffff80169106>{do_wp_page+1127}
<ffffffff80123ed3>{do_page_fault+575}
<5>        <ffffffff80169ff2>{handle_mm_fault+1228}
<ffffffff80123e9a>{do_page_fault+518}
<5>        <ffffffff8011026a>{system_call+126}
<ffffffff80132bc6>{schedule_tail+202}
<5>        <ffffffff80110d91>{error_exit+0}
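
One detail in the oops above can be checked with simple arithmetic: the faulting address in CR2 sits 0x10 bytes above the value in RAX, and the first three bytes of the "Code:" line (8b 40 10) decode to a 32-bit load from 0x10(%rax).  The sketch below only redoes that arithmetic.  It assumes the "Code:" bytes begin at the faulting instruction, which the log does not state outright but which matches CR2 and RAX.  The decoder is a hypothetical helper that handles only this one instruction form; it is not a general disassembler.

```python
# Values copied from the oops text; nothing here touches real hardware.
CR2 = 0xFFFFFF0000034010  # faulting address reported by the kernel
RAX = 0xFFFFFF0000034000  # RAX at the time of the fault
CODE = bytes.fromhex("8b4010")  # first three bytes of the "Code:" line

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_load_disp8(code):
    """Decode 'mov disp8(base), reg32' (opcode 8b, ModRM mod=01, no SIB).

    Returns (destination register, base register, signed displacement).
    Any other encoding raises ValueError.
    """
    opcode, modrm, disp = code[0], code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("not the simple 'mov disp8(base), reg32' form")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REG32[reg], REG64[rm], disp


dest, base, disp = decode_mov_load_disp8(CODE)
print(f"instruction: mov {disp:#x}(%{base}),%{dest}")
print(f"CR2 - RAX  : {CR2 - RAX:#x}")
print("offsets agree" if CR2 - RAX == disp else "offsets differ")
```

If the offsets agree, the fault is a read through the pointer held in RAX.  Together with "PTE 0" and the "Freed a Wildcard" and "Turned off DMA" messages printed just before, that would be consistent with the interrupt handler reading through a mapping that had already been torn down, but that interpretation is a guess and not something the log proves.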

System details:

HP DL380G5
CentOS x86_64 4.4
uname: 2.6.9-42.0.8.ELsmp #1 SMP Tue Jan 30 12:18:01 EST 2007 x86_64 x86_64
x86_64 GNU/Linux
T412P quad-span card
TDM400 card with two FXS modules and two FXO modules

Original software:
Libpri 1.2.4 (RPM from atrpms)
Zaptel 1.2.16 (RPM from atrpms)
Asterisk 1.2.17 (compiled from source)

Upgraded software:
Libpri 1.4.0 (compiled from source)
Zaptel 1.4.2.1 (compiled from source)
Asterisk 1.4.4 (compiled from source)

Before installing the 1.4 files, I removed the 1.2 RPMs using 'yum remove',
so there should not have been any cruft left over.  I did notice that the
atrpms tarballs put some files in different locations.  The RPM for libpri
puts the files in /usr/lib64, while my source tarball put them in /usr/lib.
'file' indicates that they are both 'ELF 64-bit LSB shared object, AMD
x86-64, version 1 (SYSV)' though, so I don't think that this is an issue of
mixed 32 and 64 bit objects.
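
The 32/64-bit check done with 'file' above can also be scripted.  The following is a minimal sketch that reads only the ELF identification bytes (byte 4, EI_CLASS, is 1 for 32-bit and 2 for 64-bit objects).  The function name is made up for illustration, and the library paths are whatever gets passed on the command line; none are assumed here.

```python
import sys


def elf_class(path):
    """Return 'ELF32', 'ELF64' or 'unknown' based on the EI_CLASS byte."""
    with open(path, "rb") as f:
        ident = f.read(5)
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        raise ValueError(f"{path}: not an ELF file")
    return {1: "ELF32", 2: "ELF64"}.get(ident[4], "unknown")


if __name__ == "__main__":
    # Usage: python elfclass.py <shared object> [<shared object> ...]
    for path in sys.argv[1:]:
        try:
            print(f"{path}: {elf_class(path)}")
        except (OSError, ValueError) as exc:
            print(f"skipped: {exc}")
```

Running it against the copies of the same library in /usr/lib and /usr/lib64 would show whether both report ELF64, matching what 'file' printed.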

I tried rebuilding all of the software from cleanly extracted tarballs to no
avail, so I had to restore my backed up 1.2 configuration.  Things still
aren't working properly: when I attempt to unload the wct4xxp module, the
"Not prepped yet!" message floods the console and the 'rmmod' command
becomes hung (unkillable), but the system does not panic.  I still have to
hard reboot to get the system in a state where I can bring up * again
though.

Since the panic seems to involve the T412P interrupts, here's the output of
/proc/interrupts:

[root at pbxtel-01 ~]# cat /proc/interrupts
           CPU0       CPU1
  0:     799008     804849    IO-APIC-edge  timer
  1:          4          5    IO-APIC-edge  i8042
  8:          5          1    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 74:      16224       3990       PCI-MSI-X  cciss0
 90:     154441          0         PCI-MSI  eth0
169:          0          0   IO-APIC-level  uhci_hcd, ehci_hcd
177:          0          0   IO-APIC-level  uhci_hcd
185:          0          0   IO-APIC-level  uhci_hcd
193:          0          0   IO-APIC-level  uhci_hcd
201:     814161     720175   IO-APIC-level  wctdm
209:     720174     814169   IO-APIC-level  wct4xxp
233:         34         47   IO-APIC-level  uhci_hcd
NMI:    1603725    1603680
LOC:    1603015    1603015
ERR:          0
MIS:          0
[root at pbxtel-01 ~]#

(This is with the system as it is running now, backed out to the 1.2
configuration.)
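
For comparing the two Zaptel cards, the per-CPU counts in the /proc/interrupts output above can be summed per device.  This is a rough sketch: the parser and the helper names are made up for illustration, and the sample text is just the timer, wctdm and wct4xxp rows copied from the output above.

```python
SAMPLE = """\
           CPU0       CPU1
  0:     799008     804849    IO-APIC-edge  timer
201:     814161     720175   IO-APIC-level  wctdm
209:     720174     814169   IO-APIC-level  wct4xxp
ERR:          0
"""


def parse_interrupts(text):
    """Map each row label to (list of per-CPU counts, description)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row names one column per CPU
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        for tok in fields[:ncpu]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def total_for(table, device):
    """Sum the counts of every row whose description names the device."""
    return sum(sum(counts) for counts, desc in table.values()
               if device in desc.split())


if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    for dev in ("timer", "wctdm", "wct4xxp"):
        print(dev, total_for(table, dev))
```

On a live system the same functions could be fed the contents of /proc/interrupts instead of SAMPLE.  With the numbers above, the wctdm and wct4xxp totals come out within a few interrupts of each other.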

Any thoughts?

Thanks

-- 
j.