[asterisk-bugs] [JIRA] (ASTERISK-26941) ARI WebSocket forcibly closed due to fatal write error on repeated mute/unmute requests

Rusty Newton (JIRA) noreply at issues.asterisk.org
Thu Apr 13 11:22:57 CDT 2017


    [ https://issues.asterisk.org/jira/browse/ASTERISK-26941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=236532#comment-236532 ] 

Rusty Newton edited comment on ASTERISK-26941 at 4/13/17 11:22 AM:
-------------------------------------------------------------------

Are you attempting to reconnect to the websocket?

I'm not the best for debugging ARI, but after talking with the developers it seems the behavior you describe is expected.

It is the responsibility of your application to reconnect if it gets disconnected, and a disconnect is expected if the write timeout is exceeded.

{quote}
; Default write timeout to set on websockets. This value may need to be adjusted
; for connections where Asterisk must write a substantial amount of data and the
; receiving clients are slow to process the received information. Value is in
; milliseconds; default is 100 ms.
;websocket_write_timeout = 100
{quote}

You should examine the rate at which you are reading and handling events and adjust the write timeout if necessary.

Your client should reconnect to the websocket if disconnected.

Let me know if I'm making sense. Also, feel free to jump on #asterisk-users or #asterisk-dev on irc.freenode.net if you want to talk about this in real-time. There are a number of regular ARI users and developers in those rooms.


was (Author: rnewton):
Are you attempting to reconnect to the websocket?

I'm not the best for debugging ARI, but after talking with the developers it seems the behavior you describe is expected.

It is the responsibility of your application to reconnect if it gets disconnected, and a disconnect is expected if the write timeout is exceeded.

{quote]
; Default write timeout to set on websockets. This value may need to be adjusted
; for connections where Asterisk must write a substantial amount of data and the
; receiving clients are slow to process the received information. Value is in
; milliseconds; default is 100 ms.
;websocket_write_timeout = 100
{quote}

You should examine the rate at which you are reading and handling events and adjust the write timeout if necessary.

Your client should reconnect to the websocket if disconnected.

Let me know if I'm making sense. Also, feel free to jump on #asterisk-users or #asterisk-dev on irc.freenode.net if you want to talk about this in real-time. There are a number of regular ARI users and developers in those rooms.

> ARI WebSocket forcibly closed due to fatal write error on repeated mute/unmute requests
> ---------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-26941
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-26941
>             Project: Asterisk
>          Issue Type: New Feature
>      Security Level: None
>          Components: Resources/res_http_websocket
>    Affects Versions: 14.3.0
>         Environment: Ubuntu 16.04.2 LTS on Windows Azure VM
>            Reporter: Colin
>            Assignee: Colin
>
> We're testing a Stasis conference application and seeing forced WebSocket closes from res_http_websocket.c. Testing with SIPp, we're calling mute/unmute asynchronously  for batches of callers (POST /channels/\{channelId}/mute). 
> We're running with about 60 SIPp connections. Other batch operations (for example move callers between bridges) complete correctly. This issue seems to be specific to mute and unmute commands. 
> Asterisk does not crash, but the Stasis app effectively becomes moribund. We're muting and unmuting callers as one group, using async tasks which we marshal before any further requests are sent to ARI. 
> We set up a test server to get some forensic information, using SIPp to simulate 60 connections. We are finding that about 10-15 successful mute/unmute operations take place from each group, before the WebSocket and HTTP connections become moribund. I can provide a debug log file. Some salient events that we've picked out when we do a group mute/unmute, however, are:
> 1) A large number of webocket writes (in the order of n x n, where n is the number of clients being muted/unmuted), like:
> {noformat}
> [Apr 10 14:36:29] DEBUG[35342] taskprocessor.c: Taskprocessor 'subm:voice_2-00000065' cleared the high water alert.
> [Apr 10 14:36:29] DEBUG[35342] res_http_websocket.c: Writing websocket string of length 668
> [Apr 10 14:36:29] DEBUG[35342] res_http_websocket.c: Writing websocket text frame, length 668
> {noformat}
> 2) After a small number of successful mute/unmute operations, the websocket faults, and closes
> {noformat}
> Apr 10 14:36:30] DEBUG[35342] res_http_websocket.c: Closing WS with 1011 because we can't fulfill a write request
> [Apr 10 14:36:30] DEBUG[35342] utils.c: Timed out trying to write 
> [Apr 10 14:36:30] DEBUG[35342] res_http_websocket.c: WebSocket connection from 'xxx.xxx.xxx.xxx:18133' forcefully closed due to fatal write error 
> {noformat}
> 3) A large number of attempted writes, followed by deactivation of the Stasis application
> {noformat}
> [Apr 10 14:36:30] DEBUG[35342] res_http_websocket.c: Writing websocket text frame, length 668
> [Apr 10 14:36:30] NOTICE[35342] ari/ari_websockets.c: Problem occurred during websocket write to :18133, websocket closed
> [Apr 10 14:36:30] WARNING[35340] ari/ari_websockets.c: WebSocket read error: Software caused connection abort
> [Apr 10 14:36:30] VERBOSE[35340] stasis/app.c: Deactivating Stasis app 'voice_2'
> {noformat}
> We can replicate this at will, and we have extensive verbose/debug logs if these are needed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list