[asterisk-dev] [Code Review] 2930: pbx.c: eliminate confusion when matching caller id and prevent removal of extension entry still in hashtable

Scott Griepentrog reviewboard at asterisk.org
Wed Oct 23 11:56:00 CDT 2013



> On Oct. 23, 2013, 4:39 a.m., Guenther Kelleter wrote:
> > I don't know whats wrong here, but I can crash asterisk with this patch on top of asterisk 11.2-cert2/Rev:397839 after a while in strcmp() with the dialplan from AST-1235 and executing "core restart gracefully" every 5 seconds:
> > 
> > (gdb) bt
> > #0  0x40088c10 in strcmp () from /lib/libc.so.6
> > #1  0x08144206 in ast_context_remove_extension_callerid2 (con=0x41418ed0, extension=0xba9ff014 "60.", priority=1,
> >     callerid=0xba9fefc4 "", matchcallerid=0, registrar=0x0, already_locked=1) at pbx.c:6976
> > #2  0x08152ab1 in __ast_context_destroy (list=0x88a3f78, contexttab=0x41414d00, con=0x0, registrar=0x413115a0 "pbx_config")
> >     at pbx.c:10607
> > #3  0x08152d21 in ast_context_destroy (con=0x0, registrar=0x413115a0 "pbx_config") at pbx.c:10673
> > #4  0x4130d88f in unload_module () at pbx_config.c:1407
> > #5  0x08119186 in ast_module_shutdown () at loader.c:519
> > #6  0x0807f726 in really_quit (num=0, niceness=-1163923436, restart=1) at asterisk.c:1774
> > #7  0x0807fc05 in handle_restart_gracefully (e=0xba9ff014, cmd=0, a=0x0) at asterisk.c:1692
> > #8  0x080cb91c in ast_cli_command_full (uid=0, gid=0, fd=50, s=0xba9ff3c4 "core restart gracefully") at cli.c:2560
> > #9  0x080cbac9 in ast_cli_command_multiple_full (uid=0, gid=0, fd=50, size=24, s=0xba9ff863 "core restart gracefully")
> >     at cli.c:2583
> > #10 0x0807d6aa in netconsole (vconsole=0x82276a0) at asterisk.c:1336
> > #11 0x0819580b in dummy_start (data=0xba9ff014) at utils.c:1028
> > #12 0x40515d92 in pthread_detach () from /lib/libpthread.so.0
> > #13 0x400e6e4a in clone () from /lib/libc.so.6
> > (gdb) info registers
> > eax            0x0      0
> > ecx            0x0      0                           <------
> > edx            0xba9ff014       -1163923436
> > ebx            0x414197a8       1094817704
> > esp            0xba9feae4       0xba9feae4
> > ebp            0xba9feaec       0xba9feaec
> > esi            0x0      0
> > edi            0x41419030       1094815792
> > eip            0x40088c10       0x40088c10
> > eflags         0x10202  66050
> > cs             0x73     115
> > ss             0x7b     123
> > ds             0x7b     123
> > es             0x7b     123
> > fs             0x0      0
> > gs             0x0      0
> > (gdb) disas
> > Dump of assembler code for function strcmp:
> > 0x40088c00 <strcmp+0>:  push   %ebp
> > 0x40088c01 <strcmp+1>:  mov    %esp,%ebp
> > 0x40088c03 <strcmp+3>:  push   %eax
> > 0x40088c04 <strcmp+4>:  push   %eax
> > 0x40088c05 <strcmp+5>:  mov    0x8(%ebp),%ecx
> > 0x40088c08 <strcmp+8>:  mov    0xc(%ebp),%edx
> > 0x40088c0b <strcmp+11>: nop
> > 0x40088c0c <strcmp+12>: lea    0x0(%esi),%esi
> > 0x40088c10 <strcmp+16>: mov    (%ecx),%al            <------
> > 0x40088c12 <strcmp+18>: inc    %ecx
> > 
> > (gdb) bt full 4
> > #0  0x40088c10 in strcmp () from /lib/libc.so.6
> > No symbol table info available.
> > #1  0x08144206 in ast_context_remove_extension_callerid2 (con=0x41418ed0, extension=0xba9ff014 "60.", priority=1,
> >     callerid=0xba9fefc4 "", matchcallerid=0, registrar=0x0, already_locked=1) at pbx.c:6976
> >         x = (struct match_char *) 0x0
> >         exten = (struct ast_exten *) 0x0
> >         prev_exten = (struct ast_exten *) 0x0
> >         peer = (struct ast_exten *) 0x414197a8
> >         ex = {exten = 0xba9feb14 "60.", matchcid = 0, cidmatch = 0xba9fefc4 "", priority = 1, label = 0x413115a0 "pbx_config",
> >   parent = 0xba9ff014, app = 0xba9fef5c "|?\237?\024?\237??\216AA", cached_app = 0x811c32b, data = 0x81db87a, datad = 0x296a,
> >   peer = 0x81de187, peer_table = 0x3, peer_label_table = 0x0,
> >   registrar = 0x81e0e48 "Remove %s/%s/%d, registrar=%s; con=%s(%p); con->root=%p\n", next = 0xba9fef78,
> >   stuff = 0xba9fef50 "\001"}
> >         exten2 = (struct ast_exten *) 0x0
> >         exten3 = (struct ast_exten *) 0x0
> >         dummy_name = "60.\000$\016R@@?Q at X\016R@@\016R@`?Q at H\016R@\000\000\000\000\002\000\000\000\000\000\000\000nD\b@\025\000\000\000.?Q@??Q@?\002\216@\020\000p@\204?\237???Q@\020\000p@\020\000p@\204?\237??sQ@ \000p@\020\000p at d\202\023@?rQ at d\202\023@?\002\216@\234?\237?\227g\b@\020\000p@\017\000\000\000\204?\237?\000\001\000\000??\237?\006P\030\b?\002\216@\236?Q@?\002\216@\204?\237?\236?Q@\016?Q@??\237?\016?Q at H\230\037\bX\213Q at p\030\000\000??Q@\210w\037\b*\227AA"...
> >         previous_peer = (struct ast_exten *) 0x41418f50
> >         next_peer = (struct ast_exten *) 0x0
> >         found = 1
> >         __PRETTY_FUNCTION__ = "ast_context_remove_extension_callerid2"
> > #2  0x08152ab1 in __ast_context_destroy (list=0x88a3f78, contexttab=0x41414d00, con=0x0, registrar=0x413115a0 "pbx_config")
> >     at pbx.c:10607
> >         extension = "60.\000??Q@\020\000p@\020\000p at D?\237??sQ@??\237???Q@??\237?\000\212\037\b\\?\237??oQ@\\?\237?\227g\b@\020\000p at d\202\023@\b?E\b??D\bl?\237?g \017@"
> >         cidmatch = "\000\016R@`?Q at H\016R@??\023@\236?Q@$\016R@@?Q at X\016R@@\016R@`?Q at H\016R@\020\000p@$?\237?\000\000\000\000nD\b@\035\000\000\000.?Q@\236?Q@?LAA\020\000p@"
> >         end_traversal = 1
> >         tmp = (struct ast_context *) 0x41418ed0
> > ---Type <return> to continue, or q <return> to quit---
> >         tmpl = (struct ast_context *) 0x88a3f78
> >         exten_item = (struct ast_exten *) 0x0
> >         prio_item = (struct ast_exten *) 0x41419a38
> >         __PRETTY_FUNCTION__ = "__ast_context_destroy"
> > #3  0x08152d21 in ast_context_destroy (con=0x0, registrar=0x413115a0 "pbx_config") at pbx.c:10673
> > No locals.
> > (More stack frames follow...)
> > 
> > -----------------------------------------------
> > And this crash also happens very often in this test:
> > 
> > (gdb) bt
> > #0  0x0814879d in ast_unregister_application (app=0x4067fc7f "Monitor") at pbx.c:8403
> > #1  0x4067f8d0 in unload_module () at res_monitor.c:939
> > #2  0x08119186 in ast_module_shutdown () at loader.c:519
> > #3  0x0807f726 in really_quit (num=0, niceness=1099017792, restart=1) at asterisk.c:1774
> > #4  0x0807fc05 in handle_restart_gracefully (e=0x4181ae40, cmd=1099007104, a=0x41818480) at asterisk.c:1692
> > #5  0x080cb91c in ast_cli_command_full (uid=0, gid=0, fd=50, s=0xba3ff3c4 "core restart gracefully") at cli.c:2560
> > #6  0x080cbac9 in ast_cli_command_multiple_full (uid=0, gid=0, fd=50, size=24, s=0xba3ff863 "core restart gracefully")
> >     at cli.c:2583
> > #7  0x0807d6aa in netconsole (vconsole=0x82276a0) at asterisk.c:1336
> > #8  0x0819580b in dummy_start (data=0x4181ae40) at utils.c:1028
> > #9  0x40515d92 in pthread_detach () from /lib/libpthread.so.0
> > #10 0x400e6e4a in clone () from /lib/libc.so.6
> > 
> > this is here (comments by me):
> > static void unreference_cached_app(struct ast_app *app)
> > {
> >         struct ast_context *context = NULL;
> >         struct ast_exten *eroot = NULL, *e = NULL;
> > 
> >         ast_rdlock_contexts();        // take a read-lock...
> >         while ((context = ast_walk_contexts(context))) {
> >                 while ((eroot = ast_walk_context_extensions(context, eroot))) {
> >                         while ((e = ast_walk_extension_priorities(eroot, e))) {
> > //line 8304---->                if (e->cached_app == app)
> >                                         e->cached_app = NULL;    // ...and then write???
> >                         }
> >                 }
> >         }
> >         ast_unlock_contexts();
> > 
> >         return;
> > }
> > 
> > There is write operations under a read-lock? 
> >
> 
> rmudgett wrote:
>     If you look at the ast_rdlock_contexts() you will see that the contexts lock is not a rd/wr lock but actually a mutex.  It has not been a rd/wr lock for quite some time.  So something else is going on.

There is a mistake in that patch.  However, even after correcting that I'm seeing additional problems via valgrind, even though it's not crashing for me.  I will submit another patch after I've resolved these issues.


- Scott


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2930/#review9980
-----------------------------------------------------------


On Oct. 21, 2013, 4:18 p.m., Scott Griepentrog wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/2930/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2013, 4:18 p.m.)
> 
> 
> Review request for Asterisk Developers and Guenther Kelleter.
> 
> 
> Bugs: AST-1235
>     https://issues.asterisk.org/jira/browse/AST-1235
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> This patch fixes a bug report where restarting Asterisk repeatedly would cause a crash due to accidentally deleting extensions that were still referenced in the context's root_table.  This occurred due to confusion over matching or not matching extensions based on callerid, which is also being cleared up via an enum'd third state.
> 
> There are two values in an extension entry:  matchcid which says whether cid should be matched, and cidmatch which is the string value to match.  The matchcid value was previously non-zero when cidmatch was signficant.  This is important since a null length callerid match must be kept distinct from a match-anything condition.
> 
> This patch adds an enum to define the AST_EXT_MATCHCID_OFF (0) or AST_EXT_MATCHCID_ON (1) behaviors, and adds AST_EXT_MATCHCID_ANY (2) which allows matching to both exten->matchcid=0 and exten->matchcid=1 cases.  For example, all exten's with the same exten (number) can be deleted with ANY, but when looking for a dialplan match it must match to matchcid, and if set to ON, further match cidmatch.
> 
> The logic to match extensions has been reworked to implement these three types, and then matchcid values updated to use the enum for clarity.
> 
> The most significant change is in ast_context_remove_extension_callerid2() where after removing extensions from the hashtable, the deletion would destroy the extensions.  This was deleting exten's that didn't have matchcid set but matched a null cidmatch, where they hadn't been removed from the hashtable.
> 
> - (!callerid || !matchcallerid || (matchcallerid && !strcmp(peer->cidmatch, callerid))) &&
> + (!callerid || (!matchcallerid && !peer->matchcid) || (matchcallerid && peer->matchcid && !strcmp(peer->cidmatch, callerid))) &&
> 
> 
> Diffs
> -----
> 
>   /branches/1.8/include/asterisk/pbx.h 400531 
>   /branches/1.8/main/pbx.c 400531 
> 
> Diff: https://reviewboard.asterisk.org/r/2930/diff/
> 
> 
> Testing
> -------
> 
> This patch has been tested with the testsuite additions dialplan_reload and pbx/callerid_match (posted separately) to insure that the crash case has been resolved without any other unwanted changes in behavior.
> 
> 
> Thanks,
> 
> Scott Griepentrog
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20131023/720eaae1/attachment-0001.html>


More information about the asterisk-dev mailing list