[asterisk-bugs] [JIRA] (ASTERISK-30482) AudioSocket: Lack of wait in loop causing high CPU usage

Ross (JIRA) noreply at issues.asterisk.org
Sat Apr 1 04:11:03 CDT 2023


     [ https://issues.asterisk.org/jira/browse/ASTERISK-30482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ross updated ASTERISK-30482:
----------------------------

    Description: 
Hi, I have become aware of a bug in the recently added AudioSocket functionality of Asterisk 18 and above. The issue is specific to the dialplan app variant of the feature and occurs on exactly [this line in apps/app_audiosocket.c|https://github.com/asterisk/asterisk/blob/edd7f1b0605e2840b2e21bf45afa67950dd3f9fe/apps/app_audiosocket.c#L184]. It is caused by a value of 0 being passed as the "ms" parameter to the "ast_waitfor_nandfds()" function. Since this function seems to eventually invoke the underlying OS's "poll()" implementation, a timeout of 0 means the call returns immediately instead of waiting for activity, so the loop it resides in runs as fast as possible, usually causing a single core to reach 100% utilization.
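
To illustrate the effect described above, here is a minimal sketch using plain POSIX "poll()" rather than any Asterisk API. The helper name "elapsed_ms_polling_idle_fd" is hypothetical and exists only for this example. It polls the read end of an idle pipe a number of times and reports how long that took; with a timeout of 0 every call returns at once, which is what turns a surrounding loop into a busy loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Asterisk): polls the read end of an
 * idle pipe `iterations` times with `timeout_ms` and returns the elapsed
 * wall-clock time in milliseconds, or -1 if the pipe cannot be created.
 */
static long elapsed_ms_polling_idle_fd(int timeout_ms, int iterations)
{
    int fds[2];
    struct pollfd pfd;
    struct timespec start, end;

    if (pipe(fds) != 0) {
        return -1;
    }

    pfd.fd = fds[0];
    pfd.events = POLLIN;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        pfd.revents = 0;
        /* Nothing is ever written to the pipe, so this only returns
         * once the timeout expires (immediately when timeout_ms is 0). */
        poll(&pfd, 1, timeout_ms);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fds[0]);
    close(fds[1]);

    return (end.tv_sec - start.tv_sec) * 1000L
        + (end.tv_nsec - start.tv_nsec) / 1000000L;
}
```

Calling the helper with a timeout of 0 and many iterations finishes almost instantly, while a non-zero timeout makes each pass sleep until the timeout expires or the descriptor becomes ready.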

My personal idea for its solution was to pass a hard-coded 2000 milliseconds instead, as 2 seconds seemed like a reasonable time to wait for activity to occur on either end (and it was also the timeout value for the initial server "connect()" call).

Also, while I don't know how much of a part this played in the mistake being made, I noticed that [the documentation for the function "ast_waitfor_nandfds()"|https://github.com/asterisk/asterisk/blob/edd7f1b0605e2840b2e21bf45afa67950dd3f9fe/include/asterisk/channel.h#L1992] doesn't mention that the "ms" parameter is an *in* parameter too, determining how long the wait will be as well as how long it _was_. 
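
As a rough sketch of what an in/out timeout parameter implies for callers, the following uses a hypothetical stand-in named "wait_readable()". It is not Asterisk's "ast_waitfor_nandfds()" and makes no claim about how that function is implemented. It assumes the behaviour described above: "*ms" is read as the timeout on entry and overwritten with how long the wait was on return. The second helper, "demo_idle_wait()", is also hypothetical and only exercises the first one on an idle pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical stand-in, NOT Asterisk's ast_waitfor_nandfds().
 * In:  *ms is the maximum time to wait, in milliseconds.
 * Out: *ms is overwritten with how long the wait actually took.
 * Returns the result of poll().
 */
static int wait_readable(int fd, int *ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    struct timespec start, end;
    int res;

    clock_gettime(CLOCK_MONOTONIC, &start);
    res = poll(&pfd, 1, *ms);
    clock_gettime(CLOCK_MONOTONIC, &end);

    *ms = (int)((end.tv_sec - start.tv_sec) * 1000L
        + (end.tv_nsec - start.tv_nsec) / 1000000L);
    return res;
}

/*
 * Hypothetical demo: waits on an idle pipe with the given timeout and
 * returns the value written back through the in/out parameter,
 * or -1 if the pipe cannot be created.
 */
static int demo_idle_wait(int timeout_ms)
{
    int fds[2];
    int ms = timeout_ms; /* set the timeout right before the call */

    if (pipe(fds) != 0) {
        return -1;
    }
    wait_readable(fds[0], &ms);
    close(fds[0]);
    close(fds[1]);
    return ms;
}
```

Because the variable is overwritten on return, a caller that waits in a loop would need to set it again before every call; a variable initialised to 0 once and never reset gives a non-blocking check on each pass.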

I know that support for AudioSocket is under the status of "Community", and I have fixed this rather trivial issue myself. More importantly, I feel, I have also cleaned up and fixed a lot of other issues present in its code: fixing the unclean app exit issue mentioned [here|https://issues.asterisk.org/jira/browse/ASTERISK-30227], improving logging/error messages, decreasing the number of calls to "read()", shortening the code in general, and replacing some of the programming tricks with potentially cleaner methods (removing pointer arithmetic, using "htons()" instead of direct bit shifting for endianness swapping). However, I'm not sure of the best way to submit these (all at once or split up), as I just made them as I worked in my testing environment and am still waiting on my contributor license agreement to be accepted.
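
For readers unfamiliar with the byte-order point, here is a small sketch comparing the two styles on a 16-bit value. Both function names are hypothetical and this is not the code from app_audiosocket.c or from the proposed cleanup; it only shows that manual shifting and "htons()" can yield the same big-endian bytes.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: big-endian encoding with explicit shifts. */
static void put_u16_shift(uint8_t *out, uint16_t value)
{
    out[0] = (uint8_t)(value >> 8);   /* high byte first */
    out[1] = (uint8_t)(value & 0xff); /* then low byte */
}

/* Hypothetical helper: the same encoding using htons(). */
static void put_u16_htons(uint8_t *out, uint16_t value)
{
    uint16_t be = htons(value); /* host order to network (big-endian) order */

    /* memcpy avoids casting the byte buffer to a uint16_t pointer. */
    memcpy(out, &be, sizeof(be));
}
```

For a value of 320 (0x0140), both helpers write the bytes 0x01 0x40 regardless of the host's byte order.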

Thank you

  was:
Hi, I have become aware of a bug in the recently added AudioSocket functionality of Asterisk 18 and above. The issue is specific to the dialplan app variant of the feature and occurs on exactly [this line in apps/app_audiosocket.c|https://github.com/asterisk/asterisk/blob/edd7f1b0605e2840b2e21bf45afa67950dd3f9fe/apps/app_audiosocket.c#L184]. It is caused by a value of 0 being passed as the "ms" parameter to the "ast_waitfor_nandfds()" function. Since this function seems to eventually invoke the underlying OS's "poll()" implementation, the function never actually waits for activity and the loop it resides in runs as fast as possible, usually causing a single core to reach 100% utilization.

My personal idea for its solution was to pass a hard-coded 2000 milliseconds instead, as 2 seconds seemed like a reasonable time to wait for activity to occur on either end (and it was also the timeout value for the initial server "connect()" call).

Also, while I don't know how much of a part this played in the mistake being made, I noticed that [the documentation for the function "ast_waitfor_nandfds()"|https://github.com/asterisk/asterisk/blob/edd7f1b0605e2840b2e21bf45afa67950dd3f9fe/include/asterisk/channel.h#L1992] doesn't mention that the "ms" parameter is an *in* parameter too, determining how long the wait will be as well as how long it _was_.

I know that support for AudioSocket is under the status of "Community", and I have fixed this rather trivial issue myself, but more importantly I feel, I also cleaned up and fixed a lot of other issues present in its code. More specifically, fixing the unclean app exit issue mentioned [here|https://issues.asterisk.org/jira/browse/ASTERISK-30227], improving logging/error messages, decreasing the number of calls to "read()", shortening the code in general and replacing some of the programming tricks with potentially cleaner methods (removing pointer arithmetic, using "htons()" instead of direct bit shifting for endianness swapping). However, I'm not sure of the best way to submit these (all at once or splitting them up) as I just performed them as I worked in my testing environment.

Thank you


> AudioSocket: Lack of wait in loop causing high CPU usage
> --------------------------------------------------------
>
>                 Key: ASTERISK-30482
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-30482
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Applications/General
>    Affects Versions: 18.0.0
>         Environment: Linux asterisk-vm 5.10.0-20-amd64 #1 SMP Debian 5.10.158-2 (2022-12-13) x86_64 GNU/Linux running in Oracle VirtualBox.
>            Reporter: Ross



--
This message was sent by Atlassian JIRA
(v6.2#6252)


