[asterisk-users] 786 000 files limit Centos 7 - Asterisk keep complaining

Stefan Viljoen viljoens at verishare.co.za
Tue Aug 11 04:00:47 CDT 2015


>> Anybody else ran into this?

>No, but I would ask myself why so many file descriptors are being used.
>It sounds like you have a file descriptor leak (not being closed when
>finished with).

Hi Tony

Thanks for replying.

I suspected something like that, but the count from repeatedly running

lsof | wc -l

always stays quite low - around 100 000 open files, which is still an
eighth of the system maximum as confirmed by running ulimit -n.

I also note that this number will increase to about 125 000 but never go
higher than that, then, as calls hang up, decrease again - during times when
the CLI is spammed with hundreds of "broken pipe" errors due to insufficient
file descriptors, this number never goes beyond 125 000 out of the
available 800 000 open files.
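
(A quicker way to watch this per process - a sketch, assuming a single
asterisk process is running - is to count its fd entries in /proc directly:

ls /proc/`pidof asterisk`/fd | wc -l

System-wide lsof output also lists memory-mapped files and cwd/txt entries,
which do not consume descriptors, so the /proc count is the more accurate
gauge against the NOFILE limit.)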

>You might also want to look at the output of lsof (or at least some of it)
>to see what all these file descriptors are pointing to, and whether it is
>indeed Asterisk that is consuming them.

If I grep the output of lsof for asterisk, the few thousand lines I have
looked at all seem to indicate legitimate uses - there are at least two
files for each conversation in progress (I assume for inward and outward
RTP), plus one for each file being recorded by MixMonitor (which also seems
logical), and also number-of-active-calls connections to res_timing_dahdi -
which all looks correct...
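
To spot a leak building up, a breakdown of Asterisk's descriptors by type
might be more telling than eyeballing raw lines - a sketch, assuming the
standard lsof column layout where the fifth column is TYPE:

lsof -p `pidof asterisk` | awk '{print $5}' | sort | uniq -c | sort -rn

If one type (e.g. IPv4 sockets or FIFOs) keeps climbing while the call
count stays flat, that would point at the leaking resource.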

>If it is Asterisk, it's quite possible, even probable, that such a leak
>has been found and fixed, even in the 1.8 series. 1.8.11.0 is rather old -
>the latest is 1.8.32.3, so it would be best to update to that version and
>see if the problem persists.

Ok, I will have to consider that. The thing is the problem is not consistent
- I can (for example) run 60 calls with no problems and no reported
failures in opening files, then calls will -decrease- to about 40 and later
spike to 70, yet at around 50 calls the errors come up thousands of times
in the CLI, then suddenly stop as the calls -increase-, which doesn't make
sense. But this kind of behaviour does seem consistent with a possible leak.

SOMETHING NEW

I have now run

/usr/bin/prlimit --pid `pidof asterisk`

and I have noticed that even though I have 800 000 files specified, the
ACTUAL limit in place on Asterisk for the number of open files is only 1024?!

# prlimit --pid `pidof asterisk`
RESOURCE   DESCRIPTION                             SOFT      HARD UNITS
AS         address space limit                unlimited unlimited bytes
CORE       max core file size                 unlimited unlimited blocks
CPU        CPU time                           unlimited unlimited seconds
DATA       max data size                      unlimited unlimited bytes
FSIZE      max file size                      unlimited unlimited blocks
LOCKS      max number of file locks held      unlimited unlimited
MEMLOCK    max locked-in-memory address space     65536     65536 bytes
MSGQUEUE   max bytes in POSIX mqueues            819200    819200 bytes
NICE       max nice prio allowed to raise             0         0
NOFILE     max number of open files                1024      4096
NPROC      max number of processes                30861     30861
RSS        max resident set size              unlimited unlimited pages
RTPRIO     max real-time priority                     0         0
RTTIME     timeout for real-time tasks        unlimited unlimited microsecs
SIGPENDING max number of pending signals          30861     30861
STACK      max stack size                       8388608 unlimited bytes
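
(To watch just the file limit rather than the whole table, prlimit can be
restricted to a single resource:

prlimit --pid `pidof asterisk` --nofile

which prints only the NOFILE row.)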

Accordingly I have put this into a cronjob run each minute:

prlimit --pid `pidof asterisk` --nofile=786000:786000

to try to force the running binary to keep a high file limit on the live
Asterisk process (sources say to keep it below the ACTUAL system-wide file
limit, in my case 800 000 files).

I'll see if this maybe helps.
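
For reference, the full crontab entry would look something like this (a
sketch, assuming it is installed in root's crontab and that prlimit lives
in /usr/bin):

* * * * * /usr/bin/prlimit --pid `pidof asterisk` --nofile=786000:786000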

So it appears that for some reason the live running asterisk process "loses
track" of how many open files it may have, or that when it starts it somehow
does not pick up the correct maximum number of open files as set in the
system / kernel config?
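
One possible explanation (an assumption on my part): limits set in
/etc/security/limits.conf only apply to PAM login sessions, not to daemons
started at boot, so a service can easily come up with the default 1024 soft
limit no matter what ulimit -n reports in a shell. If that is the case here,
Asterisk can also be told to raise the limit itself at startup via the
maxfiles option in asterisk.conf (documented in asterisk.conf.sample), which
would make the cron fixup unnecessary:

; /etc/asterisk/asterisk.conf - sketch, value taken from my setup
[options]
maxfiles = 786000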

Anyway, thank you for replying - I'll monitor this new "cronjob fixup" I'm
trying and see if it helps.

No wonder it is complaining about running out of file handles if it was
ACTUALLY limited to only 1024!

Kind regards

Stefan



