[Asterisk-code-review] Core/Stasis: Fix "FRACK!, Failed assertion bad magic number" happens ... (asterisk[16])

Joshua Colp asteriskteam at digium.com
Mon May 24 04:33:13 CDT 2021


Joshua Colp has posted comments on this change. ( https://gerrit.asterisk.org/c/asterisk/+/15762 )

Change subject: Core/Stasis: Fix "FRACK!, Failed assertion bad magic number" happens when unsubscribe an application from an event source
......................................................................


Patch Set 2:

> Patch Set 2:
> 
> The fact that the use of the RAII_VAR macro is not obvious, and that this point should be taken into account, I wrote in the commentary to the error [ASTERISK-28237].
> 
> As per Comment 1 (function messaging_app_unsubscribe_endpoint):
> 
> If we remove the additional call to the function ao2_ref (sub, -1); then a memory leak will occur. Let's consider the algorithm in order:
> 
> 1. At the very beginning of the messaging_app_unsubscribe_endpoint function, the RAII_VAR macro is defined, which, if the executable code leaves the scope of the function, will call the cleanup function (ao2_cleanup), which, in turn, will decrease the counter of references to the object by 1. In the definition of the macro to control the counter of references to object (ao2_ref) (and the corresponding function), it can be seen that the object will self-destruct if the value of the reference count, during the next call, becomes equal to 0. The above indicates that if it is necessary that the object being processed is not removed from memory, then the resulting from the outside, the object must have a reference count equal to 2. If it is necessary for the object being processed to be removed from memory, then it is necessary to additionally decrease the reference count of the object received from the outside by 1. Moreover, before the object is "deleted", it is necessary that all stored pointers to this object were zeroed, also the pointer to this object should be deleted from all stored lists in order to avoid possible subsequent exceptions. Note that I did not come up with this approach to managing objects, I only adhere to this rule.
> 
> 1.1. In addition, consider the fact that COM has a similar reference counting model:
> 1.1.1. The new, created object must have a reference count of 1. That is, it is assumed that the object is referenced by "someone" from the calling code.
> 1.1.2. The reference count of an object must be equal to the number of references to that object. That is, if an object is referenced by two other objects, then the reference count must also be 2.
> 1.1.3. When adding a reference to an object, the reference count of the object must be incremented.
> 1.1.4. When deleting a reference to an object, the reference count of the object must be decremented.
> 1.1.5. The object must have a mechanism for checking the value of the counter and a mechanism for self-destruction in case of detection of a reference counter equal to 0.
> 1.1.6. If the function processing the data contains the RAII_VAR macro for the processed one, then it is necessary that the function of receiving this object should return an object with a reference count greater than or equal to 2.
> 
> 1.2. In a general sense, the algorithm of the messaging_app_unsubscribe_endpoint function is as follows:
> 1.2.1. We define an auto-function on an object (RAII_VAR), which is triggered if the current function is exited.
> 1.2.2. We get an object with a temporarily incremented counter by 1, the called function should take care of this.
> 1.2.3. We process the object. As a result, upon exiting the function, the object's reference counter is decremented by 1.
> 
> 2. The sub object is obtained by calling the get_subscription function. Let's take a quick look at the get_subscription algorithm:
> 2.1. In the case of searching for a resource and finding an object, the ao2_find macro will return an object with a reference count of 2
> 2.2. In the case of searching by technology and finding an object, the object will also have a reference count of 2 (thanks to the ao2_bump call)
> 2.3. As a result, we have a subscription object with a reference count equal to or more than 2 (in accordance with paragraph 1.1).
> 
> 3. Next comes the check for the presence of a pointer to the subscription object. If the pointer is zeroed, then the function will be exited - normal behavior.
> 
> 4. The next step is to check whether the application has a subscription (is_app_subscribed). If the application does not have a subscription, then the function will be exited - normal behavior, since in this case it is not necessary to delete the received subscription object, but the reference count of this object will be decreased by 1 (see paragraph 1).
> 
> 5. Next comes the removal of the structure of information about the signed application from the list of those in the subscription object.
> 
> 6. Next comes the check of the number of remaining applications in the subscription. There is a point here that must be taken into account in accordance with paragraph 1.1:
> 6.1. If the number of remaining applications in the subscription is more than 0, then go to paragraph 7.
> 6.2. If the number of remaining applications for a subscription is 0, then, depending on the type of subscription, resource or technology, the pointer to the subscription object will be deleted either from the special resource container (endpoint_subscriptions), if there was one, or from the list of technologies (tech_subscriptions_lock), if there was one. But:
> 6.2.1. The block for removing the subscription pointer from the list of technologies (tech_subscriptions_lock) does nothing with the reference count, whereas:
> 6.2.2. In the block for deleting the subscription pointer from a special resource container (endpoint_subscriptions), thanks to the ao2_unlink call, the subscription reference counter is decreased by 1. That is why further in the code block, ao2_bump call was added, which increases the subscription reference counter back by 1, which is described in the comment above the call to ao2_bump.
> 
> 7. Next, we decrease the reference counter by 1 at the subscription object (call ao2_ref (sub, -1)), since previously removed one application from the list of those (see paragraph 1.1).
> 
> 8. Next, the function will be exited and the counter of references to the object will be decreased by 1 again (see paragraph 1).
> 
> As per Comment 2 (function messaging_app_subscribe_endpoint):
> 
> If you remove the proposed block of code, then when you subscribe more than one application and then unsubscribe the first and then the second application, an exception will be thrown (an attempt to access a non-existent object). Further explains why:
> 
> 1. As in the answer to the first comment, the RAII_VAR macro is defined for the subscription object.
> 
> 2. Next, we try to get the subscription object by calling get_or_create_subscription. In any case, whether an object is found in the storage, or a new object is created, the subscription reference counter will be equal to or greater than 2 (Do not forget that upon exiting the function, the subscription reference counter will be decreased back by 1).
> 
> 3. Next comes the check for the presence of a pointer to the subscription object. If the pointer is zeroed, then the function will be exited - normal behavior.
> 
> 4. The next step is to check whether the application has a subscription (is_app_subscribed). If the application already has a subscription, then the function will be exited - normal behavior, the reference counter of this object will be decreased by 1 (2-1=1) (see paragraph 1).
> 
> 5. Next is the creation of a new structure of information about the application (thanks to the call to application_tuple_alloc).
> 
> 6. Next, the new structure is added to the list of signed applications of the subscription object (sub-> applications).
> 
> 7. The next step is to check if the subscription object has more than one application. If the number of subscribed applications is greater than 1, then the subscription reference count is increased. Here's an interesting point:
> 7.1. If you do not do such a check (as it was before), then when you try to unsubscribe the second application in turn, an exception will occur, because the pointer to the subscription object will still be present in the storage (see answer to Commentary 1, paragraph 6), and the object will already be deleted due to the zeroing of the reference count. That is, in fact, two or more applications are using the subscription object, and the reference count of the subscription object would remain equal to 1 (which violates the rules described in paragraph 1.1) in the case of signing. If you unsubscribe in turn, the first call will self-destruct the subscription object, the second call will raise an exception when trying to access a non-existent object.
> 7.2. Since when creating a new subscription object, the reference count of this object is always 1 (in the case of using RAII_VAR, for simplicity, you can abstract from temporary +1 when receiving the object, and -1 when exiting the function), there is no need to increase the reference count to 1, because in this case, the subscription object contains 1 signed application in its list, and the reference count of the subscription object is 1.
> 7.3. When receiving a subscription object from the storage and then signing a new application, the subscription object will already contain more than one application in its list, respectively, it is necessary to increase the counter of links to the subscription object by 1. The reference count of the subscription object is incremented, with each call of the current function, by 1, only if the number of applications in the subscription object is more than 1.

Regarding Comment 1 then the "ao2_ref(sub, -1);" is actually for the subscription that was present in the tech_subscriptions vector, and it should live in that code block along with a comment stating as such. The code should also only perform the unreference if a subscription was actually removed.

Regarding Comment 2 after moving the "ao2_ref(sub, -1);" to the correct place I'm not sure how this would be applicable any longer, as the reference count would now be incremented by 1 when calling get_subscription and decremented by 1 by the RAII_VAR and no longer decremented a second time. While the subscription is present in either endpoint_subscriptions or tech_subscriptions a reference will now be held as was intended.


-- 
To view, visit https://gerrit.asterisk.org/c/asterisk/+/15762
To unsubscribe, or for help writing mail filters, visit https://gerrit.asterisk.org/settings

Gerrit-Project: asterisk
Gerrit-Branch: 16
Gerrit-Change-Id: Ia91b15f8e5ea68f850c66889a6325d9575901729
Gerrit-Change-Number: 15762
Gerrit-PatchSet: 2
Gerrit-Owner: Evgenios Muratidis <jone1984 at hotmail.com>
Gerrit-Reviewer: Friendly Automation
Gerrit-Reviewer: Joshua Colp <jcolp at sangoma.com>
Gerrit-Comment-Date: Mon, 24 May 2021 09:33:13 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: No
Gerrit-MessageType: comment
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-code-review/attachments/20210524/48070e73/attachment.html>


More information about the asterisk-code-review mailing list