[asterisk-bugs] [JIRA] (ASTERISK-26972) Crash in adaptive jitterbuffer

Richard Kenner (JIRA) noreply at issues.asterisk.org
Fri Apr 28 16:16:58 CDT 2017


    [ https://issues.asterisk.org/jira/browse/ASTERISK-26972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=236794#comment-236794 ] 

Richard Kenner commented on ASTERISK-26972:
-------------------------------------------

I noticed the dump two days later. so it's really only logs that I'm going by.  The crashes occurred within minutes of each other, so they must have been the same call or conference being retried.

What's wierd here is that the log on the same second as the crash says:  

[Apr 18 09:28:58] WARNING[29000][C-00008f0e] chan_iax2.c: Resyncing the jb. last_delay 0, this delay -6487433, threshold 1000, new offset 6487433

But we're not using iax2!  Although I see those messages a lot.

There are no relevant CDR entries, obviously, because the calls never completed.

I immediately "fixed" this by adding a bounds check on timer_interval, but that's not the right fix, of course.

This is very politically sensitive here because we also had crashes caused by another Asterisk crash and there's a growing movement here to abandon Asterisk "and use something that works", so I'm, unfortunately, not in a position to try to reproduce this and all the information I can get has to come from dumps and log messages.

> Crash in adaptive jitterbuffer
> ------------------------------
>
>                 Key: ASTERISK-26972
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-26972
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: General
>    Affects Versions: 14.3.0
>            Reporter: Richard Kenner
>            Assignee: Richard Kenner
>
> I've gotten three crashes with arithmetic exceptions in abstract_jb.c where 
> framedata->timer_interval is (way) over 100.  Here's the traceback for one of them:
> {noformat}
> #0  0x0000000000435018 in hook_event_cb (chan=<value optimized out>, 
>     frame=0x87a180, event=<value optimized out>, data=0x2aec10001690)
>     at abstract_jb.c:1010
> #1  0x000000000050ec25 in framehook_list_push_event (
>     framehooks=0x2aec100258b0, frame=0x1e7d3600, 
>     event=AST_FRAMEHOOK_EVENT_READ) at framehook.c:118
> #2  0x00000000004b9f0c in __ast_read (chan=0x1e71cc18, dropaudio=0)
>     at channel.c:3950
> #3  0x000000000047b9d9 in bridge_channel_handle_interval (
>     bridge_channel=0x2aec004702f8) at bridge_channel.c:1466
> #4  bridge_channel_wait (bridge_channel=0x2aec004702f8)
>     at bridge_channel.c:2619
> #5  0x000000000047c888 in bridge_channel_internal_join (
>     bridge_channel=0x2aec004702f8) at bridge_channel.c:2757
> #6  0x0000000000468a18 in ast_bridge_join (bridge=0x2aec005276d8, 
>     chan=0x1e71cc18, swap=0x0, features=0x2aec088b2b60, 
>     tech_args=<value optimized out>, flags=<value optimized out>)
>     at bridge.c:1713
> {noformat}
> and here are some things I've extracted from the dump:
> {noformat}
> (gdb) print (struct jb_framedata *) $rbp
> $6 = (struct jb_framedata *) 0x2aec10001690
> (gdb) p *$
> $7 = {jb_impl = 0x5dfa20, jb_conf = {flags = 909184, max_size = 700, 
>     resync_threshold = 1000, impl = "adaptive\000\000\000\020", 
>     target_extra = 40}, start_tv = {tv_sec = 1492522127, tv_usec = 824808}, 
>   last_format = 0x2aebcc0121f0, timer = 0x2aec1002d4c0, 
>   timer_interval = 4460210, timer_fd = 126, first = 1, jb_obj = 0x2aec10004230}
> {noformat}
> {noformat}
> (gdb) p *$6.last_format
> $8 = {name = 0x603455 "slin", codec = 0x2aebcc012110, attribute_data = 0x0, 
>   interface = 0x0}
> (gdb) print *$6.last_format->codec
> $9 = {id = 8, name = 0x603455 "slin", 
>   description = 0x60345a "16 bit Signed Linear PCM", 
>   type = AST_MEDIA_TYPE_AUDIO, sample_rate = 8000, minimum_ms = 10, 
>   maximum_ms = 70, default_ms = 20, minimum_bytes = 160, 
>   samples_count = 0x4ca520 <g726_length>, 
>   get_length = 0x4ca530 <slin_samples>, smooth = 1, mod = 0x0}
> (gdb) p/x 909184
> $10 = 0xddf80
> (gdb) p frame
> $11 = (struct ast_frame *) 0x87a180
> (gdb) print *frame
> $12 = {frametype = AST_FRAME_NULL, subclass = {integer = 0, format = 0x0, 
>     frame_ending = 0}, datalen = 0, samples = 0, mallocd = 0, 
>   mallocd_hdr_len = 0, offset = 0, src = 0x0, data = {ptr = 0x0, uint32 = 0, 
>     pad = "\000\000\000\000\000\000\000"}, delivery = {tv_sec = 0, 
>     tv_usec = 0}, frame_list = {next = 0x0}, flags = 0, ts = 0, len = 0, 
>   seqno = 0}
> {noformat}
> All three crashes were in the same place with the exact same bogus timer_interval.
> Is there anything else that would be useful to get out of this dump?  Obviously, going back in time and finding the frame that set timer_interval would be very useful, but I don't see how to get that from the dumps.  Suggestions?



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list