[asterisk-users] Tips & best practices for asterisk troubleshooting & parsing logs
Douglas Mortensen
doug at impalanetworks.com
Fri Oct 28 12:57:44 CDT 2011
Sammy, thanks for the response. So based on your recommendation, does this mean that all log lines relating to a given call will retain the same Channel Identifier String for the entire life of the call, even as it moves from the external SIP trunk provider, our queue, an agent who answers, and even another agent who the call gets transferred to? I was under the impression that the call could get passed around from one channel to another. And as a result, following a certain channel id in the logs would really only show me a part of the call, but not the entire life of the call. Am I wrong here?
I haven't really looked for whether tools like swatch, splunk, etc have asterisk-specific rules, interfaces, parsing, etc. I guess I could check.
Also, what do you think of using the asterisk SIP Debugging? Has that been helpful to you, or has the normal asterisk logging been sufficient? One thing that I feel would be more helpful with the SIP Debugging is if it would show the IP addresses the sip messages are being sent to. I guess that's what a packet analyzer (wireshark) is for....
Thanks,
-
Doug Mortensen
Network Consultant
Impala Networks
P: 505.327.7300
From: Sammy Govind [mailto:govoiper at gmail.com]
Sent: Wednesday, October 26, 2011 11:09 PM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: Re: [asterisk-users] Tips & best practices for asterisk troubleshooting & parsing logs
It was a challenge to read through all the interesting experience you've shared over here. I don't know what others may be using for parsing the logs beautifully and make them usable. What I would recommend you at the very beginning ,since you mentioned using egrep, is figure out the Channel identifier string from the logs for a particular call. That's underlined below for you.
[Oct 26 17:58:01] VERBOSE[14274] logger.c: -- Executing [s at tc-maint:3] System("Local/s at tc-maint-2496,2","/var/lib/asterisk/bin/schedtc.php 60 /var/spool/asterisk/outgoing 0") in new stack
Once you Figure out this part use egrep tool and you'll end up seeing only the data related to this particular call.
More advanced tool or techniques may involve setting up a central logging server where all the other servers deposit their logs and use monitoring tools like swatch, splunk, zabbix etc etc etc to parse the logs for you and generate alerts.
I haven't came across any Asterisk-specific log parser utility so far. Honestly, I never needed one.
On Thu, Oct 27, 2011 at 5:16 AM, Douglas Mortensen <doug at impalanetworks.com<mailto:doug at impalanetworks.com>> wrote:
Hello all,
I have been running asterisk systems since summer of 2008. I do not claim to be an expert. But I have worked through many issues during this period. I have setup & manage 5 systems, which serve 6 companies total (and of course process calls for all of the people they do business with).
I have always been happy with asterisk (well, obviously less happy during the problem times... :-). And I continue to prefer to us it. However, if I could name the one largest struggle that I have with asterisk, it is the facilities that it provides for troubleshooting issues & parsing logs.
I am hoping that someone on this mailing list can help me to realize how ignorant I really am, and how much time I have wasted parsing, "grep"ping & "less"ing logs manually. I am hoping that one of you can help me "see the light". If so, I would be most grateful.
Specifically, here are the challenges I encounter, which I would desperately appreciate help with:
Here's an example scenario:
A customer calls me & says that a call just came in & some of their wireless DECT phones (I know, trouble already.... :-) didn't ring, while others did. I tell the customer that I'll start looking into the problem immediately.
I am using AsteriskNOW with asterisk 1.6. So I SSH into the system & cd to /var/log/asterisk & start looking at the "full" log via "less". We have configured the bulk of our system via FreePBX 2.9. Inbound calls are routed first to a time condition which checks whether it is after hours. If it is not afterhours, then are then routed to a queue, which rings all phones (4 wireless DECT phones on 1 DECT wireless server that registers the SIP extensions on behalf of its 4 phones, and 4 more wireless DECT phones on their own wireless server configured the same, and an ATA connected to a paging amp that rings a loud speaker). From there, someone typically will answer the call. Often times they then transfer the call to another extension. However, sometimes no one answers the call, and it winds up going to VM.
>From the logging aspect of asterisk, it has usually felt like I am trudging through a swampy marsh trying to put the bits & pieces together. The challenge I've seen is that the above scenario can actually consist of multiple SIP calls w/ different legs. I *think* (but am not 100% sure) that often times a call can be handed off from 1 asterisk process to another. The result is that "grep"ping by the asterisk process ID shown after the VERBOSE (or NOTICE or DEBUG section [see below]), I don't actually get to see the full sequence of events in following all logging that is relevant to that phone call.
[Oct 26 17:58:01] VERBOSE[14274] logger.c: -- Executing [s at tc-maint:3] System("Local/s at tc-maint-2496,2","/var/lib/asterisk/bin/schedtc.php 60 /var/spool/asterisk/outgoing 0") in new stack
Then on a busy asterisk system, if I filter by the process id, the one process that starts handling the call originally, may wind up immediately taking on another totally unrelated call after handing the initial call off to another process. If I am not extremely careful, I may wind up mistaking the log lines for the 2nd call, as being a part of the 1st call, and then I'm totally barking up the wrong tree.... :-)
Another option I've tried is to enable SIP debugging. Generally, I do like this. And one nice thing is that asterisk seems to usually add the SIP "Date:" parameter with its SIP invites, etc. The result is that I can grep the asterisk log like this `egrep -v ^"\[" full` (SIP debug lines don't have the standard timestamp at the beginning) and then I'm only seeing the SIP debugging, in a pretty clean output. Still, there can be a LOT of SIP traffic going on, when I'm ringing 9 different extensions from a queue. Trying to parse it all can make me go cross-eyed. :-) And doing so can take a LONG time (30+ minutes). The SIP debug lines don't necessarily tell me which call they correspond with. So if there are multiple calls going on or ringing at the same time, this can really get hairy.
So more or less, I think I've sort of established the type of experience I have with parsing the asterisk logging in troubleshooting issues. Ultimately, I usually can get to the bottom of it, but it literally probably takes 45-60+ minutes of just parsing through the log files before I even get to the point, of being able to answer some seemingly simple questions.
Am I going about this the wrong way? Please say yes. Because this is literally probably the only challenge (& sometimes frustration) I have with asterisk. Are there [much] better ways to go about this?
I did a bit of searching online, and it seems that some people have attempted to create tools to parse & display the asterisk full log in a more structured simplified manner. But my initial impression was that these were more of "good ideas" that people started on, but they haven't really matured to the point of being really helpful with these types of situations.
A part of me just thinks that it just shouldn't be so hard to be able to tell whether a certain extension did actually get a SIP INVITE sent to it, or whether it responded, or to simply just follow the chain of the route that a call took through asterisk prior to running into trouble (or just where it ended up).
It seems to me that there should be some powerful tool that does the hard work of parsing the related log lines for us, so that we can just *quickly* review its output & see the sequence of events to determine exactly what happened.
Only when we find out what happened, can we even start working on the solution.
If you have some insight into reviewing, filtering & parsing asterisk logs in the process of troubleshooting issues, I would GREATLY appreciate it.
What approach would you take? What are the best practices, tips & secrets that you've developed?
Thanks all!
-
Doug Mortensen
Network Consultant
Impala Networks Inc
CCNA, MCSA, Security+, A+
Linux+, Network+, Server+
A.A.S. Information Technology
.
www.impalanetworks.com<http://www.impalanetworks.com>
P: (505) 327-7300<tel:%28505%29%20327-7300>
F: (505) 327-7545<tel:%28505%29%20327-7545>
--
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
http://www.asterisk.org/hello
asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-users/attachments/20111028/fd2f38dd/attachment.htm>
More information about the asterisk-users
mailing list