[asterisk-dev] [Code Review] 2930: pbx.c: eliminate confusion when matching caller id and prevent removal of extension entry still in hashtable
Scott Griepentrog
reviewboard at asterisk.org
Wed Oct 23 11:56:00 CDT 2013
> On Oct. 23, 2013, 4:39 a.m., Guenther Kelleter wrote:
> > I don't know whats wrong here, but I can crash asterisk with this patch on top of asterisk 11.2-cert2/Rev:397839 after a while in strcmp() with the dialplan from AST-1235 and executing "core restart gracefully" every 5 seconds:
> >
> > (gdb) bt
> > #0 0x40088c10 in strcmp () from /lib/libc.so.6
> > #1 0x08144206 in ast_context_remove_extension_callerid2 (con=0x41418ed0, extension=0xba9ff014 "60.", priority=1,
> > callerid=0xba9fefc4 "", matchcallerid=0, registrar=0x0, already_locked=1) at pbx.c:6976
> > #2 0x08152ab1 in __ast_context_destroy (list=0x88a3f78, contexttab=0x41414d00, con=0x0, registrar=0x413115a0 "pbx_config")
> > at pbx.c:10607
> > #3 0x08152d21 in ast_context_destroy (con=0x0, registrar=0x413115a0 "pbx_config") at pbx.c:10673
> > #4 0x4130d88f in unload_module () at pbx_config.c:1407
> > #5 0x08119186 in ast_module_shutdown () at loader.c:519
> > #6 0x0807f726 in really_quit (num=0, niceness=-1163923436, restart=1) at asterisk.c:1774
> > #7 0x0807fc05 in handle_restart_gracefully (e=0xba9ff014, cmd=0, a=0x0) at asterisk.c:1692
> > #8 0x080cb91c in ast_cli_command_full (uid=0, gid=0, fd=50, s=0xba9ff3c4 "core restart gracefully") at cli.c:2560
> > #9 0x080cbac9 in ast_cli_command_multiple_full (uid=0, gid=0, fd=50, size=24, s=0xba9ff863 "core restart gracefully")
> > at cli.c:2583
> > #10 0x0807d6aa in netconsole (vconsole=0x82276a0) at asterisk.c:1336
> > #11 0x0819580b in dummy_start (data=0xba9ff014) at utils.c:1028
> > #12 0x40515d92 in pthread_detach () from /lib/libpthread.so.0
> > #13 0x400e6e4a in clone () from /lib/libc.so.6
> > (gdb) info registers
> > eax 0x0 0
> > ecx 0x0 0 <------
> > edx 0xba9ff014 -1163923436
> > ebx 0x414197a8 1094817704
> > esp 0xba9feae4 0xba9feae4
> > ebp 0xba9feaec 0xba9feaec
> > esi 0x0 0
> > edi 0x41419030 1094815792
> > eip 0x40088c10 0x40088c10
> > eflags 0x10202 66050
> > cs 0x73 115
> > ss 0x7b 123
> > ds 0x7b 123
> > es 0x7b 123
> > fs 0x0 0
> > gs 0x0 0
> > (gdb) disas
> > Dump of assembler code for function strcmp:
> > 0x40088c00 <strcmp+0>: push %ebp
> > 0x40088c01 <strcmp+1>: mov %esp,%ebp
> > 0x40088c03 <strcmp+3>: push %eax
> > 0x40088c04 <strcmp+4>: push %eax
> > 0x40088c05 <strcmp+5>: mov 0x8(%ebp),%ecx
> > 0x40088c08 <strcmp+8>: mov 0xc(%ebp),%edx
> > 0x40088c0b <strcmp+11>: nop
> > 0x40088c0c <strcmp+12>: lea 0x0(%esi),%esi
> > 0x40088c10 <strcmp+16>: mov (%ecx),%al <------
> > 0x40088c12 <strcmp+18>: inc %ecx
> >
> > (gdb) bt full 4
> > #0 0x40088c10 in strcmp () from /lib/libc.so.6
> > No symbol table info available.
> > #1 0x08144206 in ast_context_remove_extension_callerid2 (con=0x41418ed0, extension=0xba9ff014 "60.", priority=1,
> > callerid=0xba9fefc4 "", matchcallerid=0, registrar=0x0, already_locked=1) at pbx.c:6976
> > x = (struct match_char *) 0x0
> > exten = (struct ast_exten *) 0x0
> > prev_exten = (struct ast_exten *) 0x0
> > peer = (struct ast_exten *) 0x414197a8
> > ex = {exten = 0xba9feb14 "60.", matchcid = 0, cidmatch = 0xba9fefc4 "", priority = 1, label = 0x413115a0 "pbx_config",
> > parent = 0xba9ff014, app = 0xba9fef5c "|?\237?\024?\237??\216AA", cached_app = 0x811c32b, data = 0x81db87a, datad = 0x296a,
> > peer = 0x81de187, peer_table = 0x3, peer_label_table = 0x0,
> > registrar = 0x81e0e48 "Remove %s/%s/%d, registrar=%s; con=%s(%p); con->root=%p\n", next = 0xba9fef78,
> > stuff = 0xba9fef50 "\001"}
> > exten2 = (struct ast_exten *) 0x0
> > exten3 = (struct ast_exten *) 0x0
> > dummy_name = "60.\000$\016R@@?Q at X\016R@@\016R@`?Q at H\016R@\000\000\000\000\002\000\000\000\000\000\000\000nD\b@\025\000\000\000.?Q@??Q@?\002\216@\020\000p@\204?\237???Q@\020\000p@\020\000p@\204?\237??sQ@ \000p@\020\000p at d\202\023@?rQ at d\202\023@?\002\216@\234?\237?\227g\b@\020\000p@\017\000\000\000\204?\237?\000\001\000\000??\237?\006P\030\b?\002\216@\236?Q@?\002\216@\204?\237?\236?Q@\016?Q@??\237?\016?Q at H\230\037\bX\213Q at p\030\000\000??Q@\210w\037\b*\227AA"...
> > previous_peer = (struct ast_exten *) 0x41418f50
> > next_peer = (struct ast_exten *) 0x0
> > found = 1
> > __PRETTY_FUNCTION__ = "ast_context_remove_extension_callerid2"
> > #2 0x08152ab1 in __ast_context_destroy (list=0x88a3f78, contexttab=0x41414d00, con=0x0, registrar=0x413115a0 "pbx_config")
> > at pbx.c:10607
> > extension = "60.\000??Q@\020\000p@\020\000p at D?\237??sQ@??\237???Q@??\237?\000\212\037\b\\?\237??oQ@\\?\237?\227g\b@\020\000p at d\202\023@\b?E\b??D\bl?\237?g \017@"
> > cidmatch = "\000\016R@`?Q at H\016R@??\023@\236?Q@$\016R@@?Q at X\016R@@\016R@`?Q at H\016R@\020\000p@$?\237?\000\000\000\000nD\b@\035\000\000\000.?Q@\236?Q@?LAA\020\000p@"
> > end_traversal = 1
> > tmp = (struct ast_context *) 0x41418ed0
> > ---Type <return> to continue, or q <return> to quit---
> > tmpl = (struct ast_context *) 0x88a3f78
> > exten_item = (struct ast_exten *) 0x0
> > prio_item = (struct ast_exten *) 0x41419a38
> > __PRETTY_FUNCTION__ = "__ast_context_destroy"
> > #3 0x08152d21 in ast_context_destroy (con=0x0, registrar=0x413115a0 "pbx_config") at pbx.c:10673
> > No locals.
> > (More stack frames follow...)
> >
> > -----------------------------------------------
> > And this crash also happens very often in this test:
> >
> > (gdb) bt
> > #0 0x0814879d in ast_unregister_application (app=0x4067fc7f "Monitor") at pbx.c:8403
> > #1 0x4067f8d0 in unload_module () at res_monitor.c:939
> > #2 0x08119186 in ast_module_shutdown () at loader.c:519
> > #3 0x0807f726 in really_quit (num=0, niceness=1099017792, restart=1) at asterisk.c:1774
> > #4 0x0807fc05 in handle_restart_gracefully (e=0x4181ae40, cmd=1099007104, a=0x41818480) at asterisk.c:1692
> > #5 0x080cb91c in ast_cli_command_full (uid=0, gid=0, fd=50, s=0xba3ff3c4 "core restart gracefully") at cli.c:2560
> > #6 0x080cbac9 in ast_cli_command_multiple_full (uid=0, gid=0, fd=50, size=24, s=0xba3ff863 "core restart gracefully")
> > at cli.c:2583
> > #7 0x0807d6aa in netconsole (vconsole=0x82276a0) at asterisk.c:1336
> > #8 0x0819580b in dummy_start (data=0x4181ae40) at utils.c:1028
> > #9 0x40515d92 in pthread_detach () from /lib/libpthread.so.0
> > #10 0x400e6e4a in clone () from /lib/libc.so.6
> >
> > this is here (comments by me):
> > static void unreference_cached_app(struct ast_app *app)
> > {
> > struct ast_context *context = NULL;
> > struct ast_exten *eroot = NULL, *e = NULL;
> >
> > ast_rdlock_contexts(); // take a read-lock...
> > while ((context = ast_walk_contexts(context))) {
> > while ((eroot = ast_walk_context_extensions(context, eroot))) {
> > while ((e = ast_walk_extension_priorities(eroot, e))) {
> > //line 8304----> if (e->cached_app == app)
> > e->cached_app = NULL; // ...and then write???
> > }
> > }
> > }
> > ast_unlock_contexts();
> >
> > return;
> > }
> >
> > There is write operations under a read-lock?
> >
>
> rmudgett wrote:
> If you look at the ast_rdlock_contexts() you will see that the contexts lock is not a rd/wr lock but actually a mutex. It has not been a rd/wr lock for quite some time. So something else is going on.
There is a mistake in that patch. However, even after correcting that I'm seeing additional problems via valgrind, even though it's not crashing for me. I will submit another patch after I've resolved these issues.
- Scott
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2930/#review9980
-----------------------------------------------------------
On Oct. 21, 2013, 4:18 p.m., Scott Griepentrog wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/2930/
> -----------------------------------------------------------
>
> (Updated Oct. 21, 2013, 4:18 p.m.)
>
>
> Review request for Asterisk Developers and Guenther Kelleter.
>
>
> Bugs: AST-1235
> https://issues.asterisk.org/jira/browse/AST-1235
>
>
> Repository: Asterisk
>
>
> Description
> -------
>
> This patch fixes a bug report where restarting Asterisk repeatedly would cause a crash due to accidentally deleting extensions that were still referenced in the context's root_table. This occurred due to confusion over matching or not matching extensions based on callerid, which is also being cleared up via an enum'd third state.
>
> There are two values in an extension entry: matchcid which says whether cid should be matched, and cidmatch which is the string value to match. The matchcid value was previously non-zero when cidmatch was signficant. This is important since a null length callerid match must be kept distinct from a match-anything condition.
>
> This patch adds an enum to define the AST_EXT_MATCHCID_OFF (0) or AST_EXT_MATCHCID_ON (1) behaviors, and adds AST_EXT_MATCHCID_ANY (2) which allows matching to both exten->matchcid=0 and exten->matchcid=1 cases. For example, all exten's with the same exten (number) can be deleted with ANY, but when looking for a dialplan match it must match to matchcid, and if set to ON, further match cidmatch.
>
> The logic to match extensions has been reworked to implement these three types, and then matchcid values updated to use the enum for clarity.
>
> The most significant change is in ast_context_remove_extension_callerid2() where after removing extensions from the hashtable, the deletion would destroy the extensions. This was deleting exten's that didn't have matchcid set but matched a null cidmatch, where they hadn't been removed from the hashtable.
>
> - (!callerid || !matchcallerid || (matchcallerid && !strcmp(peer->cidmatch, callerid))) &&
> + (!callerid || (!matchcallerid && !peer->matchcid) || (matchcallerid && peer->matchcid && !strcmp(peer->cidmatch, callerid))) &&
>
>
> Diffs
> -----
>
> /branches/1.8/include/asterisk/pbx.h 400531
> /branches/1.8/main/pbx.c 400531
>
> Diff: https://reviewboard.asterisk.org/r/2930/diff/
>
>
> Testing
> -------
>
> This patch has been tested with the testsuite additions dialplan_reload and pbx/callerid_match (posted separately) to insure that the crash case has been resolved without any other unwanted changes in behavior.
>
>
> Thanks,
>
> Scott Griepentrog
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20131023/720eaae1/attachment-0001.html>
More information about the asterisk-dev
mailing list