[asterisk-users] asterisk 16.5 / pjsip outage because of task processor queue >= 500 tasks and too many open files later on
Michael Maier
m1278468 at mailbox.org
Sat Aug 17 00:27:48 CDT 2019
Hello!
I encountered an outage of asterisk which showed like that:
- 2019-08-10 07:22:21 Asterisk start
- 2019-08-15 19:39:33 WARNING taskprocessor.c: The 'pjsip/outreg/ispPJSIP-00000060' task processor queue reached 500 scheduled tasks.
- 2019-08-15 19:39:34 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-15 19:57:19 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-15 22:59:59 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:28:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:29:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:30:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:31:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:32:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:33:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
- 2019-08-16 08:34:04 WARNING pjproject: SSL STATUS_FROM_SSL_ERR (status): Level: 0 err: <33558552> <system library-fopen-Too many open files> len: 0 peer: ISP-IP:5061
- 2019-08-16 08:34:04 ERROR pjproject: ssl0x7fbf92d096f0 Error loading CA list file '/etc/pki/tls/certs/ca-bundle.crt': Too many open files
- 2019-08-16 08:34:04 WARNING res_pjsip_outbound_registration.c: No response received from 'sip:isp.de' on registration attempt to '..', retrying in '60'
Inound calls via other ISP have been dropped - only those logentries can be seen:
- 2019-08-16 09:25:43 WARNING[27924] res_rtp_asterisk.c: Unable to allocate RTCP socket: Too many open files
- 2019-08-16 09:25:44 WARNING[27924] res_rtp_asterisk.c: Unable to allocate RTCP socket: Too many open files
limits of asterisk:
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size unlimited unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 7767 7767 processes
Max open files 1024 4096 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 7767 7767 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Memory of the device:
Disk:
7866 MB / (SD card)
771 MB Swap
phys. RAM:
2 GB
Memory consumption of the asterisk process 'ps -C asterisk u':
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
15.08.2019 19:01 asterisk 15910 2.2 12.8 2237024 258960 ? Ssl Aug10 173:59 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 20:01 asterisk 15910 2.2 12.9 2237024 260016 ? Ssl Aug10 175:06 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 21:01 asterisk 15910 2.2 12.9 2237024 260280 ? Ssl Aug10 176:11 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 22:01 asterisk 15910 2.2 12.9 2237024 260544 ? Ssl Aug10 177:16 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
15.08.2019 23:01 asterisk 15910 2.2 12.9 2237024 261072 ? Ssl Aug10 178:21 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 00:01 asterisk 15910 2.2 16.0 2302560 323984 ? Ssl Aug10 179:36 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 01:01 asterisk 15910 2.2 21.4 2368096 431076 ? Ssl Aug10 180:51 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 02:01 asterisk 15910 2.2 26.6 2499168 535784 ? Ssl Aug10 182:06 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 03:01 asterisk 15910 2.2 31.0 2564704 626192 ? Ssl Aug10 183:22 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 04:01 asterisk 15910 2.2 36.6 2695776 738540 ? Ssl Aug10 184:38 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 05:01 asterisk 15910 2.2 41.1 2761312 827892 ? Ssl Aug10 185:55 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 06:01 asterisk 15910 2.2 46.2 2892384 932244 ? Ssl Aug10 187:12 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 07:01 asterisk 15910 2.2 51.4 2957920 1036004 ? Ssl Aug10 188:31 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 08:01 asterisk 15910 2.2 56.2 3088992 1133328 ? Ssl Aug10 189:49 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
16.08.2019 09:01 asterisk 15910 2.2 58.3 3088992 1175064 ? Ssl Aug10 191:07 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
restart
16.08.2019 10:01 asterisk 629 4.9 3.5 2103380 71700 ? Ssl 09:34 1:19 /usr/sbin/asterisk -vvv -mqf -C /etc/asterisk/asterisk.conf
VSZ virtual memory size of the process in KiB (1024-byte units)
RSS resident set size, the non-swapped physical memory that a task has used (in kiloBytes).
=> Memory usage had been rapidly growing since 2019-08-16 00:00:01
Few words about the usage of asterisk:
- 2 registered endpoints
- 4 SIPS / SRTP trunks
- 46 calls at 2019-08-15
- the sip:isp.de trunk hadn't been used
Some findings:
- The problem seems to be triggered by the "task processor queue reached 500 scheduled tasks" problem. I saw this message in June, too.
Context each time was the same ISP (-> not Deutsche Telekom!) as described above in conjunction with registration retries, too.
- Registration configuration of this provider:
<Registration/ServerURI..............................> <Auth..........> <Status.......>
==========================================================================================
ispPJSIP/sip:isp.de ispPJSIP Registered
ParameterName : ParameterValue
==============================================================
auth_rejection_permanent : true
client_uri : sip:0049... at isp.de
contact_user : +49...
endpoint : ispPJSIP
expiration : 3600
fatal_retry_interval : 0
forbidden_retry_interval : 10
line : true
max_retries : 10000
outbound_auth : ispPJSIP
outbound_proxy :
retry_interval : 60
server_uri : sip:isp.de
support_path : false
transport : 0.0.0.0-tls
The expiration value given above is not true. isp.de forces ReRegistration each 90s (asterisk does it each 60s if I remember correctly)!
Contact: <sip:+49... at isp-ip:5061;transport=TLS;line=....>;expires=90
- After performing the restart of asterisk, registration to the isp.de hadn't any problem any more. Therefore I think,
the reregistration problem wasn't a problem of the provider not answering but in fact a problem of asterisk being unable to correctly perform the ReRegistration.
The final question:
===================
Is there a problem with taskprocessors probably not being canceled on some conditions (maybe unanswered or hanging registrations?) and therefore steadily growing up and using more and more open files (and memory) until asterisk can't do
anything anymore because some limits are exceeded as a result?
Could there be a problem with the retry interval 60s and the real ReRegister done each 60s, too (performing a "fork" bomb)?
Thanks
Michael
More information about the asterisk-users
mailing list