[asterisk-bugs] [JIRA] (ASTERISK-24374) "sip qualify peer" CLI command stops periodic pokes for the peer forever, if the peer is unreachable

Michele Cicciotti (PrivateWave SpA) (JIRA) noreply at issues.asterisk.org
Tue Sep 30 03:41:28 CDT 2014


Michele Cicciotti (PrivateWave SpA) created ASTERISK-24374:
--------------------------------------------------------------

             Summary: "sip qualify peer" CLI command stops periodic pokes for the peer forever, if the peer is unreachable
                 Key: ASTERISK-24374
                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-24374
             Project: Asterisk
          Issue Type: Bug
      Security Level: None
          Components: Channels/chan_sip/General
    Affects Versions: 1.8.31.0
            Reporter: Michele Cicciotti (PrivateWave SpA)
            Severity: Minor


Periodic pokes for a realtime peer are started by the first call to {{sip_poke_peer}}. {{sip_poke_peer}} cancels the current {{pokeexpire}} timer, and transmits an {{OPTION}} request to the peer. If the transmission fails, {{sip_poke_noanswer}} is called immediately; otherwise, the {{pokeexpire}} timer is set to call {{sip_poke_noanswer}} after {{maxms * 2}} milliseconds (the maximum allowed roundtrip time)

If the peer answers to the {{OPTION}} request, {{handle_response_peerpoke}} is called, which updates peer reachability and re-schedules the pokeexpire timer to call {{sip_poke_peer_s}} in {{qualifyfreq}} milliseconds (if the peer is reachable) or 10 seconds

If the peer fails to answer, {{sip_poke_noanswer}} is called by the scheduler. {{sip_poke_noanswer}} sets the peer as unreachable, and schedules the {{pokeexpire}} timer to call {{sip_poke_peer_s}} in 10 seconds

{{sip_poke_peer_s}} is little more than a wrapper to {{sip_poke_peer}}

To sum up, the periodic poke loop goes like: {{sip_poke_peer}} → {{sip_poke_noanswer}}/{{handle_response_peerpoke}} → {{sip_poke_peer_s}} → repeat

However, when {{sip_poke_peer}} is called by CLI/manager command {{sip qualify peer}}, it won't schedule a call to {{sip_poke_noanswer}}:

{code}
	} else if (!force) {
		AST_SCHED_REPLACE_UNREF(peer->pokeexpire, sched, peer->maxms * 2, sip_poke_noanswer, peer,
				unref_peer(_data, "removing poke peer ref"),
				unref_peer(peer, "removing poke peer ref"),
				ref_peer(peer, "adding poke peer ref"));
	}
{code}

If the peer is unreachable, {{handle_response_peerpoke}} will never be called. This deadlock is supposed to be broken by {{sip_poke_noanswer}}, but it's never scheduled (and not all network errors can be detected synchronously by {{transmit_invite}}, so {{sip_poke_noanswer}} may never be called directly, either). Nobody is left to schedule a call to {{sip_poke_peer_s}}: the periodic poke state machine is dead and there is no way to restart it, except by pruning the peer from the realtime table

The easiest way to fix the issue is probably to change the above code into:

{code}
	} else {
		AST_SCHED_REPLACE_UNREF(peer->pokeexpire, sched, peer->maxms * 2, sip_poke_noanswer, peer,
				unref_peer(_data, "removing poke peer ref"),
				unref_peer(peer, "removing poke peer ref"),
				ref_peer(peer, "adding poke peer ref"));
	}
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list