[asterisk-users] VERY HIGH LOAD AVERAGE: top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
Muro, Sam
research at businesstz.com
Wed Feb 10 01:12:55 CST 2010
>> Hi Team
>>
>> Can someone advise me on how I can lower the load average on my asterisk
>> server?
>>
>> dahdi-linux-2.1.0.4
>> dahdi-tools-2.1.0.2
>> libpri-1.4.10.1
>> asterisk-1.4.25.1
>>
>> 2 X TE412P Digium cards on ISDN PRI
>>
>> I'm using the system as an IVR without any transcoding or bridging
>>
>> **************************************
>> top - 10:27:57 up 199 days, 5:18, 2 users, load average: 67.75, 62.55, 55.75
>> Tasks: 149 total, 1 running, 148 sleeping, 0 stopped, 0 zombie
>> Cpu0 : 10.3%us, 32.0%sy, 0.0%ni, 57.3%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
>> Cpu1 : 10.6%us, 34.6%sy, 0.0%ni, 54.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu2 : 13.3%us, 36.5%sy, 0.0%ni, 49.8%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
>> Cpu3 : 8.6%us, 39.5%sy, 0.0%ni, 51.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu4 : 7.3%us, 38.0%sy, 0.0%ni, 54.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu5 : 17.9%us, 37.5%sy, 0.0%ni, 44.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu6 : 13.3%us, 37.2%sy, 0.0%ni, 49.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu7 : 12.7%us, 37.3%sy, 0.0%ni, 50.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>
> System is fairly loaded, but there are still plenty of idle CPU cycles. If
> we were in a storm of CPU-intensive processes, we would have expected
> many more "running" processes. Right now we have none (the single
> process is 'top' itself).
>
>> Mem: 3961100k total, 3837920k used, 123180k free, 108944k buffers
>> Swap: 779144k total, 56k used, 779088k free, 3602540k cached
>>
>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> 683 root 15 0 97968 36m 5616 S 307.7 0.9 41457:34 asterisk
>> 17176 root 15 0 2196 1052 800 R 0.7 0.0 0:00.32 top
>> 1 root 15 0 2064 592 512 S 0.0 0.0 0:13.96 init
>> 2 root RT -5 0 0 0 S 0.0 0.0 5:27.80 migration/0
>
> Processes seem to be sorted by size. You should have pressed 'P'
> (uppercase) to go back to sorting by CPU. As it stands, we don't even see
> the worst offenders.
>
Tried option 'p' but it doesn't seem to exist. CentOS 5.3, kernel 2.6.18-128.
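If the interactive sort key is unavailable, ps can produce the same ranking non-interactively (a sketch; the column set is an assumption based on the procps ps shipped with CentOS 5):

```shell
# Show the top CPU consumers without top's interactive keys.
# nlwp = thread count, useful here since Asterisk is heavily threaded.
ps -eo pid,pcpu,pmem,nlwp,state,comm --sort=-pcpu | head -n 11
```

On a procps too old to support --sort, `ps aux | sort -rnk3 | head` gives a rough equivalent (%CPU is the third column of `ps aux`).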
>
>> 3 root 34 19 0 0 0 S 0.0 0.0 0:00.11 ksoftirqd/0
>> 4 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
>> 5 root RT -5 0 0 0 S 0.0 0.0 1:07.67 migration/1
>> 6 root 34 19 0 0 0 S 0.0 0.0 0:00.09 ksoftirqd/1
>> 7 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
>> 8 root RT -5 0 0 0 S 0.0 0.0 1:16.92 migration/2
>> 9 root 34 19 0 0 0 S 0.0 0.0 0:00.03 ksoftirqd/2
>> 10 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/2
>> 11 root RT -5 0 0 0 S 0.0 0.0 1:34.54 migration/3
>> 12 root 34 19 0 0 0 S 0.0 0.0 0:00.15 ksoftirqd/3
>> 13 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/3
>> 14 root RT -5 0 0 0 S 0.0 0.0 0:54.66 migration/4
>> 15 root 34 19 0 0 0 S 0.0 0.0 0:00.01 ksoftirqd/4
>> 16 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/4
>> 17 root RT -5 0 0 0 S 0.0 0.0 1:39.64 migration/5
>> 18 root 39 19 0 0 0 S 0.0 0.0 0:00.21 ksoftirqd/5
>> 19 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/5
>> 20 root RT -5 0 0 0 S 0.0 0.0 1:06.27 migration/6
>> 21 root 34 19 0 0 0 S 0.0 0.0 0:00.03 ksoftirqd/6
>> 22 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/6
>> 23 root RT -5 0 0 0 S 0.0 0.0 1:23.24 migration/7
>> 24 root 34 19 0 0 0 S 0.0 0.0 0:00.17 ksoftirqd/7
>> 25 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/7
>> 26 root 10 -5 0 0 0 S 0.0 0.0 0:25.70 events/0
>> 27 root 10 -5 0 0 0 S 0.0 0.0 0:37.83 events/1
>> 28 root 10 -5 0 0 0 S 0.0 0.0 0:15.67 events/2
>> 29 root 10 -5 0 0 0 S 0.0 0.0 0:40.36 events/3
>> 30 root 10 -5 0 0 0 S 0.0 0.0 0:16.45 events/4
>
> Those are all kernel threads rather than real processes.
>
> So I suspect one of two things:
>
> 1. You're looking right after such a storm. The load average will decrease
> sharply.
What do you mean, Tzafrir?
It's clear that the effect grows with the number of active channels: at 90
channels the load average is about 4, but at 235 channels it is 60+.
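For reference, the Linux load average counts tasks that are runnable plus tasks in uninterruptible sleep, so the jump from 4 to 60+ can be cross-checked against a direct state count (a sketch):

```shell
# Count tasks in state R (runnable) or D (uninterruptible sleep);
# on Linux these are what the load average measures.
# '|| true' because grep -c exits non-zero when the count is 0.
ps -eo state= | grep -c '^[RD]' || true

# The kernel's own 1/5/15-minute averages, for comparison:
cat /proc/loadavg
```

If the count stays near the load average while CPUs sit mostly idle, the load is coming from 'D'-state tasks rather than CPU work.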
>
> 2. There are many processes hung in state 'D' (uninterruptible sleep,
> typically inside a system call). If a process is stuck in such a call for
> long, it normally means a problem, e.g. disk-access issues that cause
> every process trying to access a certain file to hang.
I presume this would happen if there were IRQ sharing between the disks and
the cards, which isn't the case here.
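Both hypotheses are easy to check directly (a sketch; wchan shows the kernel function a 'D'-state process is blocked in):

```shell
# 1. Any processes stuck in uninterruptible sleep, and where:
ps -eo state,pid,wchan:30,comm | awk '$1 == "D"'

# 2. Whether the Digium cards share an interrupt line: a shared IRQ
#    shows more than one driver name on a single row of this table.
cat /proc/interrupts
```

An empty result from the first command rules out the 'D'-state explanation; otherwise the wchan column usually points at the subsystem (disk, NFS, driver) that is blocking.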
>
> --