[asterisk-bugs] [JIRA] (ASTERISK-24774) Segfault in ast_context_destroy with extensions.ael and extensions.conf

Matt Jordan (JIRA) noreply at issues.asterisk.org
Sun Apr 19 20:55:33 CDT 2015


    [ https://issues.asterisk.org/jira/browse/ASTERISK-24774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=225936#comment-225936 ] 

Matt Jordan edited comment on ASTERISK-24774 at 4/19/15 8:54 PM:
-----------------------------------------------------------------

What a freaking mess. The handling of extensions in the PBX core is sometimes so screwy it boggles the mind. Thank god we have code reviews today.

Here is what is most likely happening.

# {{pbx_config}} and {{pbx_ael}} are unloaded prior to {{res_agi}}. As a result, they are unable to fully unload their extensions, as an application is still using them:
{code}
[Apr 19 15:24:59] VERBOSE[1269] pbx.c:     -- Remove test/agi1/1, registrar=pbx_config; con=<nil>((nil)); con->root=(nil)
[Apr 19 15:24:59] ERROR[1269] pbx.c: Did not remove this exten (agi1) from the context root_table (test) (priority 1)
{code}
# However, we now have a problem. Throwing a bit more debug into the process, we can see that {{end_traversal}} is 0, which means the extension, and with it the {{peer_table}} we are traversing, was destroyed. The code comment shown after the log output explains why:
{code}
[Apr 19 15:24:59] VERBOSE[1269] pbx.c:     -- Remove test/agi1/1, registrar=pbx_config; con=<nil>((nil)); con->root=(nil)
[Apr 19 15:24:59] ERROR[1269] pbx.c: Destroy 0x175f220
[Apr 19 15:24:59] ERROR[1269] pbx.c: End traversal is 0
[Apr 19 15:24:59] WARNING[1269] pbx.c: Freeing prio_iter on  0x175f220, 0x1764740
...
/* Explanation:
 * ast_context_remove_extension_callerid2 will destroy the extension that it comes across. This
 * destruction includes destroying the exten's peer_table, which we are currently traversing. If
 * ast_context_remove_extension_callerid2 ever should return '0' then this means we have destroyed
 * the hashtable which we are traversing, and thus calling ast_hashtab_end_traversal will result
 * in reading invalid memory. Thus, if we detect that we destroyed the hashtable, then we will simply
 * free the iterator
 */
if (end_traversal) {
        ast_hashtab_end_traversal(prio_iter);
} else {
        ast_log(LOG_WARNING, "Freeing prio_iter on %s %p, %p\n",
                        exten_item->exten,
                        exten_item, exten_item->peer_table);
        ast_free(prio_iter);
}
{code}
That is, *we destroyed peer_table* while iterating over it. However, we also failed to remove the extension from the context's {{root_table}}, as AGI was still hanging around.
# Following that, AGI unloads:
{code}
[Apr 19 15:24:59] VERBOSE[1269] res_agi.c:   == AGI Command 'answer' unregistered
[Apr 19 15:24:59] VERBOSE[1269] res_agi.c:   == AGI Command 'asyncagi break' unregistered
[Apr 19 15:24:59] VERBOSE[1269] res_agi.c:   == AGI Command 'channel status' unregistered
[Apr 19 15:24:59] VERBOSE[1269] res_agi.c:   == AGI Command 'database del' unregistered
[Apr 19 15:24:59] VERBOSE[1269] res_agi.c:   == AGI Command 'database deltree' unregistered
{code}
# And then core {{features}} unloads. However, because we still have the AGI-dependent extensions lurking in the {{root_table}}, we try removing and destroying their {{peer_table}} as well:
{code}
[Apr 19 15:24:59] VERBOSE[1269] pbx.c:     -- Remove parkedcalls/700/1, registrar=features; con=<nil>((nil)); con->root=(nil)
[Apr 19 15:24:59] ERROR[1269] pbx.c: Destroy 0x1761b90
[Apr 19 15:24:59] ERROR[1269] pbx.c: End traversal is 0
[Apr 19 15:24:59] WARNING[1269] pbx.c: Freeing prio_iter on  0x1761b90, 0x1761db0
[Apr 19 15:24:59] ERROR[1269] pbx.c: Starting traversal on agi1 0x1764de0, (nil)
{code}
The important bit in the above is that we ostensibly only want to destroy context/extension/priority tuples registered by {{features}}, but our lurking {{agi1}} extension - which has already had its {{peer_table}} destroyed - is still waiting.
# At which point, we pass a NULL pointer into {{ast_hashtab_start_traversal}}, and blow up:
{code}
prio_iter = ast_hashtab_start_traversal(exten_item->peer_table);
{code}

The solution here is probably as simple as checking that someone else didn't destroy {{peer_table}} and leave the extension floating around.
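To make the proposed check concrete, here is a minimal standalone sketch of the guard. It is not the actual {{pbx.c}} change: {{struct peer_table}}, {{struct exten}}, {{visit_priorities}} and {{count_skipped}} are hypothetical stand-ins invented for illustration, and the real code would test {{exten_item->peer_table}} before handing it to {{ast_hashtab_start_traversal}}.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the real Asterisk structures. */
struct peer_table {
    int priorities; /* number of priorities stored for the extension */
};

struct exten {
    const char *name;
    struct peer_table *peer_table; /* NULL once an earlier pass destroyed it */
};

/*
 * Visit the priorities of one extension.
 * Returns the number of priorities visited, or -1 if the extension was
 * skipped because its peer_table was already destroyed.
 */
static int visit_priorities(const struct exten *e)
{
    if (!e->peer_table) {
        /* The guard: never start a traversal on a destroyed table. */
        fprintf(stderr, "Skipping %s: peer_table already destroyed\n", e->name);
        return -1;
    }
    return e->peer_table->priorities;
}

/* Walk every extension left in a context and count the ones skipped. */
static int count_skipped(const struct exten *extens, size_t n)
{
    int skipped = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (visit_priorities(&extens[i]) < 0) {
            skipped++;
        }
    }
    return skipped;
}
```

With a lurking {{agi1}} entry whose {{peer_table}} is NULL next to a healthy entry, the walk skips the former instead of dereferencing NULL. Whether the stale extension should also be removed from {{root_table}} at that point is a separate question this sketch does not address.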







> Segfault in ast_context_destroy with extensions.ael and extensions.conf
> -----------------------------------------------------------------------
>
>                 Key: ASTERISK-24774
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-24774
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/PBX
>    Affects Versions: 11.16.0
>            Reporter: Corey Farrell
>         Attachments: backtrace_11054.txt, backtrace_noload-pbx_lua.txt, extensions.conf, testsuite-pbx-callerid_match.patch
>
>
> While attempting to resolve open channels in testsuite/tests/pbx/callerid_match I am experiencing a segfault every time.  I do not know AGI enough to understand why, but running 'agi.finish()' on the calls in this test seems to cause a segfault on shutdown (somehow contexts become corrupted).


