[asterisk-users] Issue with PJSIP contacts being "unavailable"

asterisk at phreaknet.org asterisk at phreaknet.org
Tue Jun 27 06:21:52 CDT 2023


For the past couple of weeks I've been having a serious issue where many
users' devices show up as "Unavailable" according to PJSIP. The
underlying problem is that res_pjsip thinks there are no available
contacts for the device and keeps the contact marked "Unavailable" in
the normal course of operation, even while it is exchanging SIP traffic
with the device. This is particularly problematic because I rely on
accurate device state, and consequently users are not receiving calls.

The issue seems to depend on the combination of chan_pjsip and the
particular device. Users who also have a line on a system using chan_sip
have no issues on that line (i.e. one line works fine while the other is
broken in this manner). Likewise, there are other devices that seem to
work fine and don't have this issue. Consequently, it seems like this
should be fixable by changing something on the ATAs *or* something on
the Asterisk side (and maybe both, for good measure). All units are
provisioned more or less identically, and I've ruled out firmware
version as a factor in this particular case. Here I'm focused on
Grandstream HT 802s in particular, since they make up the majority of
both the devices in the field and the devices with problems right now.

After capturing some debug logs, I thought this might be related to not
receiving OPTIONS responses from the endpoint, though that doesn't seem
to be consistent. At times I do see a response, though in the
"unavailable" trace linked below I don't. Either way, the contact is
consistently unavailable. I looked into this about a year ago when I
noticed this issue (at the time it wasn't impacting many devices) and
came to roughly the same conclusion. Only now has it had a wide impact
for a prolonged period of time.
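
To check independently whether an ATA answers OPTIONS at all, a small
probe along these lines might help. This is only a sketch and not part
of Asterisk: the function names (build_options, parse_status_line,
probe) and the "probe" user are made up for illustration, and it assumes
the device is reachable over UDP from the machine running it. A device
behind NAT, or one that only accepts SIP from its configured server, may
ignore a probe sent from a different source port, so a timeout here does
not by itself prove the device is unreachable from Asterisk.

```python
import socket
import uuid


def build_options(target_host, target_port, local_host, local_port, user="probe"):
    """Build a minimal RFC 3261 OPTIONS request (hypothetical helper)."""
    # Branch IDs must start with the RFC 3261 magic cookie "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch};rport",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(data):
    """Return (code, reason) from a SIP response, or None if it is not one."""
    first = data.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = first.split(" ", 2)
    if len(parts) < 2 or parts[0] != "SIP/2.0" or not parts[1].isdigit():
        return None
    reason = parts[2] if len(parts) > 2 else ""
    return (int(parts[1]), reason)


def probe(target_host, target_port=5060, timeout=2.0):
    """Send one OPTIONS over UDP; return (code, reason) or None on no reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on UDP just fixes the peer and picks our local address.
        sock.connect((target_host, target_port))
        local_host, local_port = sock.getsockname()
        sock.send(build_options(target_host, target_port, local_host, local_port))
        try:
            return parse_status_line(sock.recv(4096))
        except OSError:
            # Covers timeouts and ICMP "port unreachable" errors.
            return None
```

Calling probe("192.0.2.10") (a placeholder address) returns the status
code and reason phrase of the first reply, or None if nothing comes
back within the timeout.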

I will say that this seems to crop up now and then with chan_pjsip.
I've been seeing this type of thing occasionally for years now: users
won't receive calls, and when I run "pjsip show endpoint XXX", it says
"Unavailable" with no available contacts, even though the device is
registered. I know that chan_pjsip doesn't use registrations at all to
determine device state availability. Maybe chan_sip does and that's why
it works more reliably; I'm not entirely sure. I'd think that a REGISTER
alone ought to be sufficient to toggle the device state from
"unavailable" to "not in use" or something else. At the very least, it
would be a failsafe that makes device state less buggy, since a device
that just registered is clearly not unavailable. Currently there seem to
be OPTIONS every 30 seconds, and we have quite a low REGISTER interval
as well. Changing the OPTIONS/keepalive settings on the ATAs brought no
improvement, though.
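
To make the failsafe idea concrete, here is a rough Python sketch of the
decision I have in mind. It is purely illustrative and is NOT how
res_pjsip computes device state today: the function name and parameters
are hypothetical, and treating any unexpired registration as "fresh" is
my own assumption about where to draw the line.

```python
def effective_state(qualify_ok, secs_since_register, register_expiry):
    """Hypothetical failsafe: a recent REGISTER overrides a failed qualify.

    qualify_ok          -- True if the last OPTIONS qualify got a response
    secs_since_register -- seconds since the last REGISTER, or None if never
    register_expiry     -- registration lifetime in seconds
    """
    if qualify_ok:
        return "NOT_INUSE"
    # Assumption: a registration that has not expired yet counts as
    # evidence that the device is reachable.
    if secs_since_register is not None and secs_since_register < register_expiry:
        return "NOT_INUSE"
    return "UNAVAILABLE"
```

With a 120 second expiry, a device that failed qualify but registered 30
seconds ago would still be treated as "not in use", while one whose last
REGISTER was 300 seconds ago would be "unavailable".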

Apart from this being disruptive right now, it's also been a blocker for 
other chan_sip migrations due to the severity of the issue, and I'd 
really like to figure out how it could be resolved or mitigated, so any 
insight would be much appreciated - thanks!

Trace from an "unavailable" ATA (not working correctly): 
https://paste.interlinked.us/iz07sapwrb.txt

Trace from an "available" ATA (working correctly): 
https://paste.interlinked.us/ocutyjslmg.txt



