[asterisk-dev] The "Busy" App.... isn't.

Matt Fredrickson creslin at digium.com
Wed Apr 4 16:20:56 CDT 2018


On Mon, Apr 2, 2018 at 9:40 PM, Steve Murphy <murf at parsetree.com> wrote:
> Someone complained about errant behavior of the Busy and Hangup apps...
> and I see some strangenesses that might make them credible.
>
> Boy, did I have to chase this around to find what I think is the problem!
>
> What happens: The (eg) Busy app is called. Instead of it waiting a generous
> 15 seconds,
> it immediately hangs up, not forwarding anything to the trunk the call came
> in from. This
> is the closing act in the blocking of a call, executed from the dialplan.
>
> With a generous amount of debug added to help explain the order of
> execution, I see
> this:
>
> [02-Apr-2018 23:31:25.313] VERBOSE[21813][C-00000005] pbx.c: Executing
> [s at Busy:1] Busy("SIP/sip-out-5-00000002", "15") in new stack
> [02-Apr-2018 23:31:25.313] VERBOSE[21813][C-00000005] pbx.c: in
> pbx_builtin_busy, about to indicate, data is 15
> [02-Apr-2018 23:31:25.314] VERBOSE[21813][C-00000005] pbx.c: returned from
> ast_indicate
> [02-Apr-2018 23:31:25.314] VERBOSE[21813][C-00000005] pbx.c: got lock
> [02-Apr-2018 23:31:25.314] VERBOSE[21813][C-00000005] pbx.c: channel not up,
> setting hangup cause
> [02-Apr-2018 23:31:25.314] VERBOSE[21813][C-00000005] pbx.c: setting channel
> to busy
> [02-Apr-2018 23:31:25.314] VERBOSE[21813][C-00000005] pbx.c: unlocked,
> waiting for handup
> [02-Apr-2018 23:31:25.314] VERBOSE[21813][C-00000005] pbx.c: in
> wait_for_hangup, data is 15
> [02-Apr-2018 23:31:25.314] VERBOSE[21813][C-00000005] pbx.c:
> wait_for_hangup, waitsec is: 15
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] pbx.c:
> wait_for_hangup, waittime is: 15000
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep() calling ast_safe_sleep_conditional()
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: entered, timeout_ms: 15000
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: in while();
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past the cond test, about to call
> waitfor;
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c: ast_waitfor
> calling ast_waitfor_nandfds()
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds();n=1, nfds=0
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); past the first for() statement-- wait full
> interval
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); build the pollfd array. rms=15000,
> AST_MAX_FDS=11
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Add the individual fds nfds=0
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); start= 67108865
> [02-Apr-2018 23:31:25.315] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Linux fix
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); ast_poll returns 1; max=4, kbrms=15000
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c:
> pfds[0].fd = 25;  .events=3;   .revents=0
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c:
> pfds[1].fd = 26;  .events=3;   .revents=0
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c:
> pfds[2].fd = 30;  .events=3;   .revents=0
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c:
> pfds[3].fd = 31;  .events=3;   .revents=1
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); cleared flags rms=15000, res=1
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); check for pending events max=4
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); past the pending events for loop, ms=15000
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Final return of winner
> [02-Apr-2018 23:31:25.316] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past the cond test, waitfor returns
> 15000
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past ast_waitfor, ms>0
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: ast_read returned
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: ast_read returned a frame
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: the frame is NOT deferrable
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: in while();
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past the cond test, about to call
> waitfor;
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c: ast_waitfor
> calling ast_waitfor_nandfds()
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds();n=1, nfds=0
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); past the first for() statement-- wait full
> interval
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); build the pollfd array. rms=14998,
> AST_MAX_FDS=11
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Add the individual fds nfds=0
> [02-Apr-2018 23:31:25.317] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); start= 67108865
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Linux fix
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); ast_poll returns 1; max=4, kbrms=14998
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c:
> pfds[0].fd = 25;  .events=3;   .revents=0
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c:
> pfds[1].fd = 26;  .events=3;   .revents=0
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c:
> pfds[2].fd = 30;  .events=3;   .revents=0
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c:
> pfds[3].fd = 31;  .events=3;   .revents=1
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); cleared flags rms=14998, res=1
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); check for pending events max=4
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); past the pending events for loop, ms=14998
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Final return of winner
> [02-Apr-2018 23:31:25.318] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past the cond test, waitfor returns
> 14998
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past ast_waitfor, ms>0
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: ast_read returned
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: ast_read returned NULL, breaking out
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: in while();
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past the cond test, about to call
> waitfor;
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c: ast_waitfor
> calling ast_waitfor_nandfds()
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds();n=1, nfds=0
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); past the first for() statement-- wait full
> interval
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); build the pollfd array. rms=14996,
> AST_MAX_FDS=11
> [02-Apr-2018 23:31:25.319] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Add the individual fds nfds=0
> [02-Apr-2018 23:31:25.320] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); start= 67108865
> [02-Apr-2018 23:31:25.320] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); Linux fix
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); ast_poll returns 0; max=4, kbrms=14996
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> pfds[0].fd = 25;  .events=3;   .revents=0
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> pfds[1].fd = 26;  .events=3;   .revents=0
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> pfds[2].fd = 30;  .events=3;   .revents=0
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> pfds[3].fd = 31;  .events=3;   .revents=0
> :
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); cleared flags rms=0, res=0
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c: In
> non-EPOLL ast_waitfor_nandfds(); returning winner early
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: made it past the cond test, waitfor returns 0
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: beyond the while()
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: about to lock channel
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: packets queued = 0
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] channel.c:
> ast_safe_sleep_conditional: channel unlocked
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] pbx.c:
> wait_for_hangup, waited 15000 msec
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] pbx.c:
> wait_for_hangup, returning
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] pbx.c: returned from
> wait for hangup
> [02-Apr-2018 23:31:40.328] VERBOSE[21813][C-00000005] pbx.c: Spawn extension
> (Busy, s, 1) exited non-zero on 'SIP/sip-out-5-00000002'
>
> So, for some odd reason, the ast_poll call in the ast_waitfor_nandfds
> routine reports twice that something is waiting in one of the 4 fd's
> involved. The first one gets a frame from ast_read(),
> but the second gets no frame. (Maybe it was partial??)...
>
> At any rate, the code in ast_safe_sleep_conditional() says this:
>
>                 ms = ast_waitfor(chan, ms);
>                 ast_verb(3,"ast_safe_sleep_conditional: made it past the
> cond test, waitfor returns %d \n", ms);
>                 if (ms < 0) {
>                         res = -1;
>                         break;
>                 }
>                 if (ms > 0) {
>                         ast_verb(3,"ast_safe_sleep_conditional: made it past
> ast_waitfor, ms>0\n");
>                         f = ast_read(chan);
>                         ast_verb(3,"ast_safe_sleep_conditional: ast_read
> returned\n");
>                         if (!f) {
>                                 ast_verb(3,"ast_safe_sleep_conditional:
> ast_read returned NULL, breaking out\n");
>                                 // res = -1;
>                                 // break;
>                                 continue; // we still need to finish out the
> wait!!!!!!!
>                         }
>                         ast_verb(3,"ast_safe_sleep_conditional: ast_read
> returned a frame\n");
>
>                         if (!ast_is_deferrable_frame(f)) {
>                                 ast_verb(3,"ast_safe_sleep_conditional: the
> frame is NOT deferrable\n");
>                                 ast_frfree(f);
>                                 continue;
>                         }
>                         ast_verb(3,"ast_safe_sleep_conditional: the frame is
> deferrable\n");
>
>                         if ((dup_f = ast_frisolate(f))) {
>                                 if (dup_f != f) {
>                                         ast_frfree(f);
>                                 }
>                                 ast_verb(3,"ast_safe_sleep_conditional:
> inserting the frame in list\n");
>                                 AST_LIST_INSERT_HEAD(&deferred_frames,
> dup_f, frame_list);
>                         }
>                 }
>         }
>
> I commented out the res= -1 and break, and added the continue; instead, as
> you can see above.
>
> [02-Apr-2018 23:33:48.234] VERBOSE[21904][C-00000007] pbx.c: Executing
> [s at Busy:1] Busy("SIP/sip-out-5-00000003", "15") in new stack
> [02-Apr-2018 23:33:48.235] VERBOSE[21904][C-00000007] pbx.c: in
> pbx_builtin_busy, about to indicate, data is 15
> [02-Apr-2018 23:33:48.235] VERBOSE[21904][C-00000007] chan_sip.c:
> <--- Reliably Transmitting (no NAT) to 66.23.129.253:5060 --->
> SIP/2.0 486 Busy Here
> Via: SIP/2.0/UDP
> 66.23.129.253:5060;branch=z9hG4bK7caf.48d7ef727e3bb0c87cbb158c3ef18e63.0;received=66.23.129.253
> Via: SIP/2.0/UDP
> 66.23.129.250:5060;received=10.153.1.250;branch=z9hG4bK7caf.de6aa0c534eb22921f9e6f85fa5eb040.0
> Via: SIP/2.0/UDP
> 209.193.79.30;branch=z9hG4bK7caf.7a633e1044f70f55d71ce7f568ee5fe0.1
> Via: SIP/2.0/UDP 67.231.5.112:5060;branch=z9hG4bK00B3ce218f4d0c0d1ca
> From: "MURPHY STEVE " <sip:1307xxxxxxx at 67.231.5.112>;tag=gK0069f789
> To: <sip:15713854677 at 209.193.79.30>;tag=as241b593f
> Call-ID: 828413536_117349475 at 67.231.5.112
> CSeq: 423701 INVITE
> Server: nexVortex Inc Hosted 3.0 PBX
> Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY, INFO,
> PUBLISH, MESSAGE
> Supported: replaces, timer
> Content-Length: 0
>
>
> <------------>
> [02-Apr-2018 23:33:48.235] VERBOSE[21904][C-00000007] pbx.c: returned from
> ast_indicate
> [02-Apr-2018 23:33:48.236] VERBOSE[21904][C-00000007] pbx.c: got lock
> [02-Apr-2018 23:33:48.236] VERBOSE[21904][C-00000007] pbx.c: channel not up,
> setting hangup cause
> [02-Apr-2018 23:33:48.236] VERBOSE[21904][C-00000007] pbx.c: setting channel
> to busy
> [02-Apr-2018 23:33:48.236] VERBOSE[21904][C-00000007] pbx.c: unlocked,
> waiting for handup
> [02-Apr-2018 23:33:48.236] VERBOSE[21904][C-00000007] pbx.c: in
> wait_for_hangup, data is 15
> [02-Apr-2018 23:33:48.237] VERBOSE[21904][C-00000007] pbx.c:
> wait_for_hangup, waitsec is: 15
> [02-Apr-2018 23:33:48.237] VERBOSE[21904][C-00000007] pbx.c:
> wait_for_hangup, waittime is: 15000
> [02-Apr-2018 23:33:48.237] VERBOSE[21904][C-00000007] channel.c:
> ast_safe_sleep() calling ast_safe_sleep_conditional()
> [02-Apr-2018 23:33:48.237] VERBOSE[21904][C-00000007] channel.c:
> ast_safe_sleep_conditional: entered, timeout_ms: 15000
> [02-Apr-2018 23:33:48.237] VERBOSE[21904][C-00000007] channel.c:
> ast_safe_sleep_conditional: in while();
> [02-Apr-2018 23:33:48.237] VERBOSE[21904][C-00000007] channel.c:
> ast_safe_sleep_conditional: made it past the cond test, about to call
> waitfor;
> [02-Apr-2018 23:33:48.237] VERBOSE[21904][C-00000007] channel.c: ast_waitfor
> calling ast_waitfor_nandfds()
> ....
>
> I cut off the rest of the debug, but the call lasts the full 15  seconds
> after busy() is called, and then hangs up. It's interesting to see what the
> trunk provider does on top of all this.
>
> So, my question is: Did I do the best thing? It seems to work, but I have no
> idea if I'm creating bugs galore in other usages.

There is an assumption (for asterisk C level applications, like
Busy()) when using ast_read that if it ever returns NULL that the
underlying channel has been hung up and the application should quickly
exit.  Busy() is honoring that assumption.  It looks like in
sip_indicate() of chan_sip.c, when a AST_CONTROL_BUSY is queued on a
SIP channel, it sets the soft hangup flag on the channel.  That's why
you're seeing this behavior.  The timeout from Busy() is invalidated
if the underlying channel wants to be hung up.

I would be highly concerned about breaking that assumption, as you are
doing in your proposed patch.  Is there a particular reason you want
to keep the call up for 15 seconds before releasing the channel?

-- 
Matthew Fredrickson
Digium, Inc. | Engineering Manager
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA



More information about the asterisk-dev mailing list