[asterisk-bugs] [JIRA] (ASTERISK-27192) res_pjsip: Loss of SIP registrations causing unavailable endpoints

Scott Pabin (JIRA) noreply at issues.asterisk.org
Thu Aug 31 04:08:08 CDT 2017


    [ https://issues.asterisk.org/jira/browse/ASTERISK-27192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=238341#comment-238341 ] 

Scott Pabin commented on ASTERISK-27192:
----------------------------------------

I may have read through the IRC too quickly, but it looked liked it was proposed to simply remove the oldest *contact* to make room for the registering device.  This would obviously cause a loop of registrations when *max_contacts* = x and NumOfDevices > x, if the devices are set to reregister automatically.  Such a loop happens now when *remove_existing* = yes is used with *max_contacts* > 1, as it seems (without looking the code) that *remove_existing* removes +all+ contacts for that AOR when a new device attempts to register against that AOR.

The 'bump off the oldest pollable (responding) contact' when a new device is attempting to register in excess of *max_contacts* seems to be the less desirable option as the device should instead be denied with a *403 Forbidden*.  There are scenarios where bumping off the oldest contact (not in a call) is desirable, and that is where a configuration item named *remove_existing* would make sense by allowing the latest registration attempt to always register successfully.

Perhaps it would be advantageous to poll existing contacts +when+ a device attempts to register in excess of *max_contacts*, instead of waiting for them to time out. You can call it *auto_purge* and make it the default.  With this solution, Asterisk can respond within a new device's SIP registration dialog with either a *200 OK* (when there is space) or *403 Forbidden* (when it's full).  A request in excess of *max_contacts* should continue to throw a warning, and it should trigger an event if it does not already.  In this scenario, *rewrite_contact* would work just fine, if I'm understanding its mechanism.  The polling option also allows for the quickest device reregistration when ports have changed, such as is possible after a client-side router restart.  The timeout method results in tech-support calls, as it's not an _instantaneous_ recovery.

> res_pjsip: Loss of SIP registrations causing unavailable endpoints
> ------------------------------------------------------------------
>
>                 Key: ASTERISK-27192
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-27192
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Resources/res_pjsip_registrar
>    Affects Versions: 13.17.0
>            Reporter: Richard Mudgett
>
> When {{rewrite_contact}} is enabled, the {{max_contacts}} count can block re-registrations because the source port from the endpoint can be random.  When the re-registration is blocked, the endpoint may give up re-registering and require manual intervention.  While working on ASTERISK-27147 this issue was identified and discussed on IRC.
> {quote}
> (11:41:44 AM) rmudgett: file: gtjoseph  So...  What is the benefit to having a set number of max_contacts to pjsip AORs?  In light of rewrite_contact this can get in the way if you re-register after a restart.
> (11:42:55 AM) gtjoseph: not sure actually other than limit 1
> (11:43:39 AM) rmudgett: Setting to one actually makes the issue with rewrite_contact more likely.
> (11:43:43 AM) gtjoseph: is the issue having a limit of having more than 1?
> (11:44:06 AM) file: the issue is that if it's not a refresh then the REGISTER would be rejected
> (11:44:14 AM) file: until a contact expires
> (11:44:38 AM) rmudgett: Yes.  And if the register is rejected, the endpoint may give up trying to register.
> (11:45:51 AM) gtjoseph: well, we also have "remove_existing"
> (11:45:53 AM) file: I think having remove_existing work in that case smarter would be the option, it would still limit to a max but allow registrations
> (11:46:07 AM) rmudgett: There is also the chan_sip compatibility option of remove_existing in the mix.
> (11:46:55 AM) rmudgett: file: How could remove_existing work smarter?
> (11:47:06 AM) file: what does it do right now in that scenario?
> (11:49:46 AM) gtjoseph: are we thinking that multiple contacts will be an issue or that having the set limit will be?
> (11:50:06 AM) file: the set limit is the problem
> (11:50:08 AM) rmudgett: The set limit is the problem.
> (11:50:55 AM) file: I think remove_existing should remove the number of contacts required to allow the new REGISTER to succeed
> (11:51:31 AM) gtjoseph: but which ones?  Oldest?
> (11:51:43 AM) file: the ones that will expire naturally soonest
> (11:51:51 AM) gtjoseph: yeah makes sense
> (11:52:30 AM) rmudgett: Remove existing removes all contacts that were not just registered.  After it ignores any currently in the contacts container
> (11:52:53 AM) rmudgett: remove_existing won't let you add more than max_contacts
> (11:53:13 AM) file: I know, I'm saying I think it is reasonable for the behavior to change to the above
> (11:53:21 AM) file: which solves the problem
> (11:54:35 AM) rmudgett: I'm just wondering why have a set limit.  Either max_contacts=0 to disable registrations or max_contacts=infinity.
> (11:55:42 AM) gtjoseph: not sure i can think of an exact use case immediately but we really can't change it now
> (11:55:45 AM) rmudgett: Is there some benefit to setting max_contacts=3 for instance?
> (11:56:31 AM) rmudgett: It could be documented that max_contacts is not a security feature and is not recommended to set it to a low value.
> (11:56:31 AM) file: outright benefit? no - but it can limit exposure
> (11:56:49 AM) file: specifying a recommendation is fine
> (11:56:56 AM) file: but as gtjoseph said - it's not getting removed
> (11:57:35 AM) gtjoseph: it can also be used by a service provider to only allow a certain number of servers to register to an endpoint.
> (11:58:56 AM) rmudgett: Which said service provider is likely to need rewrite_contact enabled and cause the problem of hitting the limit because of the random remote port number being registered.
> (11:59:18 AM) file: which is why I said I think it's reasonable for remove_existing to behave as I stated, which solves the problem
> (11:59:31 AM) gtjoseph: exactly
> (12:01:49 PM) rmudgett: That changes remove_existing behavior from chan_sip behavior as it is documented to enable.
> (12:02:41 PM) file: in a configuration that mirrors chan_sip it would function the same
> (12:02:55 PM) file: the only scenario where it functions differently is if multiple contacts are enabled, which is a chan_pjsip feature
> (12:03:09 PM) file: and it would function in a way which is better for the user
> (12:03:46 PM) file: and if you are enabling multiple contacts then the existing behavior is against that from the first place - as others would be removed defeating it in the first place
> (12:06:58 PM) rmudgett: So.  The new behavior would be:
> (12:06:58 PM) rmudgett: 1) Don't check max_contacts if remove_existing is enabled
> (12:06:58 PM) rmudgett: 2) Process the register contacts.
> (12:06:58 PM) rmudgett: 3) If remove_existing enabled then remove the soonest to expire contacts until there are at most max_contacts.
> (12:06:58 PM) rmudgett: 4) send register response
> (12:07:35 PM) file: if you are referring to new logic, yes, that sounds about right
> (12:12:44 PM) rmudgett: With the new behavior, remove_existing could use a better name.
> (12:42:23 PM) gtjoseph: rmudgett: careful with that logic...   depending on Expires,  times the pass that removes the soonest to expire might remove the one just registered.
> (12:42:40 PM) gtjoseph: s/times//
> (12:42:41 PM) infobot: gtjoseph meant: rmudgett: careful with that logic...   depending on Expires,   the pass that removes the soonest to expire might remove the one just registered.
> (12:43:21 PM) gtjoseph: i think you have to remove until there's an open spot, then register.
> (12:50:43 PM) rmudgett: We certainly cannot add more than the max.  We already keep track of the unmodified contacts so we can remove the soonest to expire unmodified contacts.
> (12:52:43 PM) gtjoseph: ok, so long as it's "unmodified contacts"
> {quote}
> Proposed new algorythm in res_pjsip_registrar.c:register_aor_core() for {{remove_existing}} option behavior:
> {code}
> if remove_existing
>   if MAX(added + updated - deleted, 0) > max_contacts
>      respond forbidden
> else
>   if MAX(added - deleted, 0) + current_contacts_count > max_contacts
>      respond forbidden
> process the REGISTER contacts keeping track of which contacts were not affected by this REGISTER
> if remove_existing
>   while current_contacts_count > max_contacts
>     ast_assert(unmodified contact count > 0)
>     remove the soonest to expire unmodified contact
> respond successful with current contacts
> {code}
> The fix for ASTERISK-27147 will take care of connection oriented transports like TCP and TLS.  However, the UDP transport will still have this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list