[asterisk-bugs] [JIRA] (ASTERISK-30469) res_pjsip_pubsub: Regression for subscription shutdowns
N A (JIRA)
noreply at issues.asterisk.org
Wed Mar 29 17:02:03 CDT 2023
[ https://issues.asterisk.org/jira/browse/ASTERISK-30469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=261661#comment-261661 ]
N A commented on ASTERISK-30469:
--------------------------------
I just tested it and the correctness issue is resolved.
I don't have MALLOC_DEBUG on that system currently so I can't say anything immediately about leaks, though that wasn't my initial concern.
I'll test this at another site as well that gets more traffic which should expose any correctness issues.
> res_pjsip_pubsub: Regression for subscription shutdowns
> -------------------------------------------------------
>
> Key: ASTERISK-30469
> URL: https://issues.asterisk.org/jira/browse/ASTERISK-30469
> Project: Asterisk
> Issue Type: Bug
> Security Level: None
> Components: Resources/res_pjsip_pubsub
> Affects Versions: 20.2.0
> Environment: pjproject 2.13
> Reporter: N A
> Assignee: N A
> Severity: Major
>
> ASTERISK-30325 caused a regression with subscription shutdowns in res_pjsip_pubsub.c.
> On the upside, it does appear to have fixed a crash issue I was having with PJSIP which I now see was a bug in pjproject the whole time.
> On the downside, when HAVE_PJSIP_EVSUB_PENDING_NOTIFY is defined, clean_sub_tree is *NEVER* called (as opposed to called in a different place as the code allegedly says it should be), so the refactoring of this code seems to be wrong or incomplete. This is confirmed by adding debugs and code that previously ran before this patch to notify a pubsub module of a subscription module is no longer invoked at all (see below). This can cause all kinds of problems in pubsub modules since they never learn a subscription no longer exists. In particular, the case I'm testing is when an endpoint sends a SUBSCRIBE with an Expires of 0 to cancel the subscription (not sure if that matters). Reverting to a version of res_pjsip_pubsub prior to this change makes things work properly.
> This comment that was added recently seems suspect to me, though it doesn't elaborate specifically on which invocation is being referenced:
> {noformat}
> /* for pjproject <2.13, this cleanup occurs here. For >=2.13, pubsub_on_evsub_state
> is called before pubsub_on_rx_refresh and so must be cleaned there.*/
> {noformat}
> If we look at the trace below, this does not seem to match up. pubsub_on_evsub_state is called before pubsub_on_rx_refresh, but it is also called AFTER it as well. So I can see that this would not be the right place to call clean_sub_tree necessarily, as you wouldn't want it firing twice for TERMINATED events. Nonetheless, it needs to be called somewhere.
> A patch that resolves this issue and restores the correct shutdown behavior is attached, but further scrutiny is probably warranted.
> {noformat}
> [2023-03-15 17:35:32.803] DEBUG[206558]: res_pjsip_pubsub.c:3959 pubsub_on_evsub_state: evsub 0x7f4824060bf8 state TERMINATED event TSX_STATE sub_tree 0x 7f48240601e0 sub_tree state Normal
> [2023-03-15 17:35:32.803] DEBUG[206558]: res_pjsip_pubsub.c:3981 pubsub_on_evsub_state: Called here2
> [2023-03-15 17:35:32.803] DEBUG[206558]: res_pjsip_pubsub.c:4109 pubsub_on_rx_refresh: evsub 0x7f4824060bf8 sub_tree 0x7f48240601e0 sub_tree state Normal
> [2023-03-15 17:35:32.803] DEBUG[206558]: res_pjsip_pubsub.c:4202 pubsub_on_rx_refresh: XXXXXXX else
> [2023-03-15 17:35:32.803] DEBUG[206558]: res_pjsip_pubsub.c:686 subscription_persistence_update: Updating persistence for 'PolycomSCA1->PolycomSCA1' pru ne on boot: no
> [2023-03-15 17:35:32.804] DEBUG[206558]: res_pjsip_pubsub.c:3990 pubsub_on_refresh_timeout: sub_tree 0x7f48240601e0 sub_tree state TerminatePending
> [2023-03-15 17:35:32.804] DEBUG[206558]: res_pjsip_pubsub.c:686 subscription_persistence_update: Updating persistence for 'PolycomSCA1->PolycomSCA1' pru ne on boot: no
> [2023-03-15 17:35:32.846] DEBUG[206558]: res_pjsip_pubsub.c:4235 pubsub_on_rx_refresh: Called here
> [2023-03-15 17:35:32.846] DEBUG[206558]: res_pjsip_pubsub.c:4237 pubsub_on_rx_refresh: Called here2
> [2023-03-15 17:35:32.846] <--- Transmitting SIP response (581 bytes) to UDP:192.168.10.106:5060 --->
> [2023-03-15 17:35:32.846] SIP/2.0 200 OK
> [2023-03-15 17:35:32.846] Via: SIP/2.0/UDP 192.168.10.106;rport=5060;received=192.168.10.106;branch=z9hG4bK489e9c0651EAD9AD
> [2023-03-15 17:35:32.846] Call-ID: e9b6fbd1-d4bb662e-1c7a7f25 at 192.168.10.106
> [2023-03-15 17:35:32.846] From: "401" <sip:PolycomSCA1 at 192.168.10.143>;tag=19AED809-557F37D6
> [2023-03-15 17:35:32.846] To: <sip:PolycomSCA1 at 192.168.10.143>;tag=86490954-62cb-4ea1-bcc6-d1297cfe3351
> [2023-03-15 17:35:32.846] CSeq: 3 SUBSCRIBE
> [2023-03-15 17:35:32.846] Expires: 0
> [2023-03-15 17:35:32.846] Contact: <sip:192.168.10.143:14444>
> [2023-03-15 17:35:32.846] Allow: OPTIONS, REGISTER, SUBSCRIBE, NOTIFY, PUBLISH, INVITE, ACK, BYE, CANCEL, UPDATE, PRACK, MESSAGE, REFER
> [2023-03-15 17:35:32.846] Supported: 100rel, timer, replaces, norefersub
> [2023-03-15 17:35:32.846] Server: Asterisk
> [2023-03-15 17:35:32.846] Content-Length: 0
> [2023-03-15 17:35:32.846]
> [2023-03-15 17:35:32.846]
> [2023-03-15 17:35:32.847] <--- Transmitting SIP request (677 bytes) to UDP:192.168.10.106:5060 --->
> <snip>
> [2023-03-15 17:35:32.847]
> [2023-03-15 17:35:32.847]
> [2023-03-15 17:35:32.847] DEBUG[206558]: res_pjsip_pubsub.c:3959 pubsub_on_evsub_state: evsub 0x7f4824060bf8 state TERMINATED event USER sub_tree 0x7f482 40601e0 sub_tree state TerminateInProgress
> [2023-03-15 17:35:32.847] DEBUG[206558]: res_pjsip_pubsub.c:3981 pubsub_on_evsub_state: Called here2
> [2023-03-15 17:35:32.898] <--- Received SIP response (545 bytes) from UDP:192.168.10.106:5060 --->
> [2023-03-15 17:35:32.898] SIP/2.0 200 OK
> PJPROJECT compile time config currently running against:
> PJ_CC_NAME/VER_(1,2,3) : gcc-10.2.1
> PJ_DEBUG : 0
> PJ_EXCEPTION_USE_WIN32_SEH: 0
> PJ_FUNCTIONS_ARE_INLINED : 0
> PJ_HAS_EVENT_OBJ : 1
> PJ_HAS_EXCEPTION_NAMES : 1
> PJ_HAS_FLOATING_POINT : 1
> PJ_HAS_HIGH_RES_TIMER : 1
> PJ_HAS_INT64 : 1
> PJ_HAS_IPV6 : 1
> PJ_HAS_PENTIUM : 0
> PJ_HAS_POOL_ALT_API : 0
> PJ_HAS_SEMAPHORE : 1
> PJ_HAS_TCP : 1
> PJ_HAS_THREADS : 1
> PJ_IOQUEUE_HAS_SAFE_UNREG : 1
> PJ_IOQUEUE_MAX_HANDLES : 5000
> PJ_IS_(BIG/LITTLE)_ENDIAN : little-endian
> PJ_LOG_MAX_LEVEL : 6
> PJ_LOG_MAX_SIZE : 4000
> PJ_LOG_USE_STACK_BUFFER : 1
> PJ_LOG_USE_STACK_BUFFER : 1
> PJ_MAX_EXCEPTION_ID : 16
> PJ_MAX_HOSTNAME : 256
> PJ_M_NAME : x86_64
> PJ_OS_HAS_CHECK_STACK : 0
> PJ_OS_NAME : x86_64-unknown-linux-gnu
> PJ_POOL_DEBUG : 0
> PJ_TIMESTAMP_USE_RDTSC: : 0
> PJ_VERSION : 2.13
> ioqueue type : epoll[0x3]
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.2#6252)
More information about the asterisk-bugs
mailing list