[asterisk-users] asterisk 16.5 / pjsip outage because of task processor queue >= 500 tasks and too many open files later on

Michael Maier m1278468 at mailbox.org
Sat Aug 17 00:27:48 CDT 2019


Hello!

I encountered an outage of asterisk which showed like that:


- 2019-08-10 07:22:21 Asterisk start
- 2019-08-15 19:39:33 WARNING taskprocessor.c: The 'pjsip/outreg/ispPJSIP-00000060' task processor queue reached 500 scheduled tasks.

- 2019-08-15 19:39:34 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-15 19:57:19 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-15 22:59:59 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:28:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:29:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:30:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:31:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:32:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'

- 2019-08-16 08:33:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:34:04 WARNING pjproject: SSL STATUS_FROM_SSL_ERR (status): Level: 0 err: <33558552> <system library-fopen-Too many open files> len: 0 peer: ISP-IP:5061
- 2019-08-16 08:34:04 ERROR pjproject: ssl0x7fbf92d096f0 Error loading CA list file '/etc/pki/tls/certs/ca-bundle.crt': Too many open files
- 2019-08-16 08:34:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'

Inound calls via other ISP have been dropped - only those logentries can be seen:
- 2019-08-16 09:25:43 WARNING[27924] res_rtp_asterisk.c: Unable to allocate RTCP socket: Too many open files
- 2019-08-16 09:25:44 WARNING[27924] res_rtp_asterisk.c: Unable to allocate RTCP socket: Too many open files



limits of asterisk:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             7767                 7767                 processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       7767                 7767                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

Memory of the device:
Disk:
7866 MB / (SD card)
771  MB Swap

phys. RAM:
2    GB


Memory consumption of the asterisk process 'ps -C asterisk u':

		 USER     PID    %CPU %MEM  VSZ     RSS     TTY    STAT START TIME   COMMAND
15.08.2019 19:01 asterisk 15910  2.2  12.8  2237024 258960  ?      Ssl  Aug10 173:59 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 20:01 asterisk 15910  2.2  12.9  2237024 260016  ?      Ssl  Aug10 175:06 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 21:01 asterisk 15910  2.2  12.9  2237024 260280  ?      Ssl  Aug10 176:11 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 22:01 asterisk 15910  2.2  12.9  2237024 260544  ?      Ssl  Aug10 177:16 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 23:01 asterisk 15910  2.2  12.9  2237024 261072  ?      Ssl  Aug10 178:21 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 00:01 asterisk 15910  2.2  16.0  2302560 323984  ?      Ssl  Aug10 179:36 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 01:01 asterisk 15910  2.2  21.4  2368096 431076  ?      Ssl  Aug10 180:51 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 02:01 asterisk 15910  2.2  26.6  2499168 535784  ?      Ssl  Aug10 182:06 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 03:01 asterisk 15910  2.2  31.0  2564704 626192  ?      Ssl  Aug10 183:22 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 04:01 asterisk 15910  2.2  36.6  2695776 738540  ?      Ssl  Aug10 184:38 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 05:01 asterisk 15910  2.2  41.1  2761312 827892  ?      Ssl  Aug10 185:55 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 06:01 asterisk 15910  2.2  46.2  2892384 932244  ?      Ssl  Aug10 187:12 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 07:01 asterisk 15910  2.2  51.4  2957920 1036004 ?      Ssl  Aug10 188:31 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 08:01 asterisk 15910  2.2  56.2  3088992 1133328 ?      Ssl  Aug10 189:49 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 09:01 asterisk 15910  2.2  58.3  3088992 1175064 ?      Ssl  Aug10 191:07 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
restart
16.08.2019 10:01 asterisk   629  4.9  3.5   2103380 71700   ?      Ssl  09:34   1:19 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf

VSZ       virtual memory size of the process in KiB (1024-byte units)
RSS       resident set size, the non-swapped physical memory that a task has used (in kiloBytes).

=> Memory usage had been rapidly growing since 2019-08-16 00:00:01


Few words about the usage of asterisk:
- 2 registered endpoints
- 4 SIPS / SRTP trunks
- 46 calls at 2019-08-15
- the sip:isp.de trunk hadn't been used


Some findings:

- The problem seems to be triggered by the "task processor queue reached 500 scheduled tasks" problem. I saw this message in June, too.
  Context each time was the same ISP (-> not Deutsche Telekom!) as described above in conjunction with registration retries, too.

- Registration configuration of this provider:

 <Registration/ServerURI..............................>  <Auth..........>  <Status.......>
==========================================================================================

 ispPJSIP/sip:isp.de                                     ispPJSIP          Registered

 ParameterName            : ParameterValue
 ==============================================================
 auth_rejection_permanent : true
 client_uri               : sip:0049... at isp.de
 contact_user             : +49...
 endpoint                 : ispPJSIP
 expiration               : 3600
 fatal_retry_interval     : 0
 forbidden_retry_interval : 10
 line                     : true
 max_retries              : 10000
 outbound_auth            : ispPJSIP
 outbound_proxy           :
 retry_interval           : 60
 server_uri               : sip:isp.de
 support_path             : false
 transport                : 0.0.0.0-tls


The expiration value given above is not true. isp.de forces ReRegistration each 90s (asterisk does it each 60s if I remember correctly)!
    Contact: <sip:+49... at isp-ip:5061;transport=TLS;line=....>;expires=90

- After performing the restart of asterisk, registration to the isp.de hadn't any problem any more. Therefore I think,
  the reregistration problem wasn't a problem of the provider not answering but in fact a problem of asterisk being unable to correctly perform the ReRegistration.




The final question:
===================
Is there a problem with taskprocessors probably not being canceled on some conditions (maybe unanswered or hanging registrations?) and therefore steadily growing up and using more and more open files (and memory) until asterisk can't do
anything anymore because some limits are exceeded as a result?
Could there be a problem with the retry interval 60s and the real ReRegister done each 60s, too (performing a "fork" bomb)?



Thanks
Michael



More information about the asterisk-users mailing list