[asterisk-dev] Some 1.4.0-beta2 and Solaris 10 issues...

Jason Parker jparker at digium.com
Sat Sep 23 14:36:45 MST 2006


With the locking, the fix will likely be as simple as modifying lock.h. It's something we're aware of, and it is being looked into. 
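
For anyone following the lock.h discussion, here is a minimal sketch of the constructor approach described in item 1 of the quoted message below: a statically allocated mutex that is explicitly initialised as PTHREAD_MUTEX_RECURSIVE before main() runs. The names demo_list_lock, demo_list_lock_init, demo_list_lock_fini and demo_relock are hypothetical and are not the actual lock.h or linkedlist.h macros. The constructor/destructor attributes are a GCC/Clang extension, so this assumes one of those compilers.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the lock embedded in a static list head. */
static pthread_mutex_t demo_list_lock;

/* Runs before main(), so the lock is already recursive by the time
 * any module code can take it. */
static void __attribute__((constructor)) demo_list_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    assert(rc == 0);
    rc = pthread_mutex_init(&demo_list_lock, &attr);
    assert(rc == 0);
    (void)rc; /* keeps the build warning-free when NDEBUG is defined */
    pthread_mutexattr_destroy(&attr);
}

/* Runs at exit and releases the mutex resources. */
static void __attribute__((destructor)) demo_list_lock_fini(void)
{
    pthread_mutex_destroy(&demo_list_lock);
}

/* Takes the lock twice from the same thread. Returns 0 only if the
 * second acquisition succeeds, i.e. the mutex really is recursive. */
int demo_relock(void)
{
    int rc = pthread_mutex_lock(&demo_list_lock);
    if (rc != 0)
        return rc;

    /* trylock reports EBUSY instead of hanging if the mutex is not recursive */
    rc = pthread_mutex_trylock(&demo_list_lock);
    if (rc == 0)
        pthread_mutex_unlock(&demo_list_lock);

    pthread_mutex_unlock(&demo_list_lock);
    return rc;
}
```

Calling demo_relock() from any thread should return 0 when the constructor has run. A mutex that was only given a static initial value, without the explicit init, is not guaranteed to be recursive, which matches the deadlock reported on Solaris.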

In trunk/1.4, m4 was not used anywhere in the tree, so it was removed. The rest are now retrieved via autoconf. I do not know how autoconf gets the search paths; I can only assume that it uses the user's current PATH. Somebody else will have to answer this question in more detail. 

As far as your bug reports go, I'm more than a little confused. Since Zaptel is Linux specific, you can't possibly be using the Zaptel from Digium. Are you perhaps using the one from www.solarisvoip.com? I guess we need a little bit of clarification on that. 

----- Original Message ----- 
From: Bob Atkins <bob at digilink.net> 
To: Asterisk Developers Mailing List <asterisk-dev at lists.digium.com> 
Sent: Saturday, September 23, 2006 1:04:18 PM GMT-0800 
Subject: Re: [asterisk-dev] Some 1.4.0-beta2 and Solaris 10 issues... 

Jason, 

With regard to things like the /usr/xpg4/bin/ issue, I find this item a little surprising, given that I had reported it and related issues a while back as bugs that, as far as I know, had been accepted and incorporated into the Makefile. I suggest that variables be used to define specific executables in the Makefile so that various platform/OS differences can be accommodated, rather than relying on the search path of the user who is compiling Asterisk. 

In the 1.2.12.1 Makefile the following has been defined: 

ifeq ($(OSARCH),SunOS) 
GREP=/usr/xpg4/bin/grep 
M4=/usr/local/bin/m4 
ID=/usr/xpg4/bin/id 
LN=/usr/xpg4/bin/ln 
INSTALL=ginstall 
endif 

Interestingly, the M4 definition is no good since /usr/local is not a standard path on Solaris. It should be /opt/bin/m4. 

I have two open bugs on Mantis, 7876 and 7875, that affect Solaris portability. BTW, I do have a disclaimer on file. I forgot to check the disclaimer box when I filed bug 7876. 

As a side note, since Solaris uses kernel threads there may be larger threading issues lurking around that may not be as obvious as the one I reported in 7875 for res_musiconhold. The 'hanging' that Lee is reporting when asterisk is loading may actually be thread deadlocks/race conditions that are more deeply rooted than his fix may indicate. 

Since Linux doesn't use kernel based threads it is unlikely that a programmer who is only debugging on Linux will consider the need to call sched_yield() or thr_yield() on Solaris 2.8 when a thread might be delayed by something. However, on Solaris such delays will simply stop the thread and all subsequent threads that are waiting on completion of the stopped thread. The various thread delays may be small enough in most configurations or operational scenarios to not affect the fundamental operation of asterisk when configured say with fewer than X extensions, SIP or IAX peers, voicemail boxes, etc. However, there may be some magic configuration or operational scenario that will expose thread deadlocks/race conditions on Solaris systems because threads are not using calls to sched_yield() (or thr_yield() on Solaris 2.8) before potentially delaying operations or functions. 
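
To make the yield suggestion concrete, here is a minimal sketch, assuming a thread that polls a shared flag while waiting on another thread. It uses the POSIX sched_yield() call from <sched.h>. All demo_* names are hypothetical, and this is not taken from Asterisk or from the patch attached to bug 7875.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Flag set by the worker thread once its (pretend) work is finished. */
static atomic_int demo_ready;

static void *demo_worker(void *arg)
{
    (void)arg;
    atomic_store(&demo_ready, 1);
    return NULL;
}

/* Polls the flag, giving up the CPU on every pass so the worker can be
 * scheduled. Returns the number of times the caller yielded. */
long demo_wait_with_yield(void)
{
    long yields = 0;

    while (!atomic_load(&demo_ready)) {
        sched_yield();
        yields++;
    }
    return yields;
}

/* Starts the worker, waits for it with yields, and returns the final
 * flag value (1 on success, -1 if the thread could not be created). */
int demo_run(void)
{
    pthread_t tid;
    int rc;

    atomic_store(&demo_ready, 0);
    if (pthread_create(&tid, NULL, demo_worker, NULL) != 0)
        return -1;

    demo_wait_with_yield();

    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc; /* keeps the build warning-free when NDEBUG is defined */
    return atomic_load(&demo_ready);
}
```

A condition variable is usually a better fit than a polling loop. The sketch only shows where a yield would sit in code that already polls.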

Since no action has been taken on bug 7875, I can only assume that the issue I reported has not been addressed in the 1.4 tree. The issue only becomes evident if asterisk is compiled with the zaptel drivers, so it probably isn't being noticed by anyone running on Solaris; I don't think many people know that good zaptel drivers are available for Solaris. Since we are running asterisk 1.2.12.1 with zaptel drivers on Solaris 2.8, we had to solve this most glaring problem with res_musiconhold. We have not, however, undertaken a detailed review of all of the asterisk code to see whether other thread deadlock/race conditions exist elsewhere, mainly because asterisk has been performing well for us in a production environment on Solaris 2.8. Again, I also think that we haven't hit some magic combination of configuration and operational load that could expose another potential race condition. 

Of course, since we haven't examined all of the asterisk code, what I present here is only a hypothetical concern. Perhaps there are no other instances in the asterisk code that, like res_musiconhold, would cause a thread deadlock/race condition. Then again, I think it would be time well spent to find out. 

--- 
Bob 


Jason Parker wrote:

----- Lee Essen <lee.essen at nowonline.co.uk> wrote:

Hi,

I'm not sure if anyone is tracking Solaris support for Asterisk here, but these may be of interest and some of the issues may actually exist on other platforms ... I can raise these all in the bug tracker if needed, so please advise the best thing to do.

It's been a bit of a mammoth session trying to get this to work (with the mutex issue being particularly awkward) so I apologise for the length of this ... but it at least builds and runs on Solaris 10, testing actual calls will follow later! :-)

Regards,

Lee.

1. Linked List Mutex issue -- this looks like a generic problem

In lock.h, for static AST_MUTEX's the define (__AST_MUTEX_DEFINE) has a constructor that calls ast_mutex_init() which ensures the type is set to PTHREAD_MUTEX_RECURSIVE, this is all great and works perfectly, however if you look at linkedlist.h you will see that AST_LIST_HEAD_STATIC has an ast_mutex_t field that is given an initial value but not subsequently ast_mutex_init'ed, on Solaris this causes it to not be a PTHREAD_MUTEX_RECURSIVE and causes Asterisk to deadlock on loading modules.

As a workaround I've added the constructor and destructor to the AST_LIST_HEAD_STATIC routine and it works fine, but the final solution will need to be somewhat more complex given the arrangement of #ifdef's in lock.h.

2. DEBUG_THREAD and comparing structs??

If you turn on DEBUG_THREAD then lock.h does a number of struct comparisons that are invalid and cause the compilation to fail, I did experiment with using memcmp instead but didn't manage to get it to work properly. For reference these are generally comparisons of t->mutex with either PTHREAD_MUTEX_INITIALIZER or (empty_mutex).

3. Missing cast of PTHREAD_MUTEX_INITIALIZER

Still in lock.h, at the end of __ast_pthread_mutex_destroy there is an assignment to t->mutex which fails to compile unless you cast it to (pthread_mutex_t). I think this was only with DEBUG_THREAD set.

4. build_tools/prep_moduledeps ... "-e"

On Solaris both grep and echo don't take a "-e" option, the grep fails and the echo causes an extra "-e" in the output. Removing the "-e" seems to work fine (although see later about C++ files.)

5. build_tools/get_moduleinfo and build_tools/get_makeopts

On Solaris awk doesn't like empty expressions (//) and fails to build all of the dependency stuff. Removing the // works fine ... I'm not aware of this being a problem with any other awk's so this should be an easy one!

6. menuselect-tree issues with C++ files (vpb and kdeconsole)

For C++ source files the menuselect-tree file seems to have an extra "." which causes the dependencies to fail (i.e. "vpb." and "vpb..o" etc.) I haven't dug into this in any detail yet and could be related to the previous issues.

7. install-sh problems

By default the install-sh script is picked but used as "./install-sh" which is fine until the make sequence moves into another directory. In addition install-sh uses "mv" by default which moves all the include files out of the way causes later installs to fail. This is easily fixed by using INSTALL="/usr/ucb/install", but it would be good if it detected this automatically.

8. curl include path

Not really an asterisk issue, but for some reason the SunFreeWare curl package doesn't return the include info from curl-config (the lib stuff is fine) ... would be nice to have that as a "configure" option -- at the moment I have to hack the makeopts file.

9. "id" command options in Makefile

On Solaris the "id" command does not take the "-un" options, the easiest fix here would be to use $LOGNAME or $USER.

10. $includedir in Makefile bininstall

In the bininstall section $(DESTDIR)$(ASTHEADERDIR) is created, but $(DESTDIR)$(includedir) which is used immediately after is not created, causing problems if they happen to be different (which they are by default on Solaris.)

11. tar in sounds/Makefile

Solaris tar does not take "C" as an option (tar xCf) ... gtar is available and works perfectly.

12. wait4 different usage between Solaris and Linux

In main/asterisk.c wait4(-1, ...) is used, on Solaris the same behaviour actually needs a wait4(0,...) -- you'll have lots of AGI zombie processes around if left unchanged.

13. FYI ... Solaris make is rubbish and gmake is too old.

By default Solaris has "gmake" (gnu make) v3.80 which fails with a "virtual memory exhausted" error, and the default Solaris "make" doesn't get anywhere. Upgrading to gnu make v3.81 fixes the memory problem.

I've been doing a bit of the work to get Asterisk running properly on Solaris, so I've seen a few of the issues you're running into.

4. I thought I had fixed this...guess not.

7. If you install fileutils, it will find and use ginstall instead of install-sh.

8. I've not seen this problem when using the blastwave curl package.

9. The version of id in /usr/xpg4/bin/ does support -un. Is this in your PATH?

10. Do we just need to do a `$(INSTALL) -d $(DESTDIR)$(includedir)` here?

11. This was recently changed, and it will be fixed shortly.

13. You do indeed need gmake 3.81. This requirement may be removed in the future (for tarball releases).

Please post a bug report for the rest. It may make sense to group some of them together in one report.
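
On item 12 (wait4 and the AGI zombies), one possible way to sidestep the platform difference is to use POSIX waitpid(), where a pid of -1 means "any child" on every conforming system. This is a swapped-in technique for illustration only; it is not what main/asterisk.c does, and the demo_* names below are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reaps every child that has already exited, without blocking.
 * Returns the number of children collected on this pass. */
int demo_reap_exited(void)
{
    int status;
    int reaped = 0;

    /* WNOHANG makes waitpid() return 0 when children are still running */
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
    return reaped;
}

/* Hypothetical driver: forks n children that exit immediately, then
 * blocks until all of them have been collected. Returns the number
 * collected, or -1 if fork() fails. */
int demo_spawn_and_reap(int n)
{
    int status;
    int reaped = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(0); /* child: nothing to do, exit straight away */
    }

    for (;;) {
        pid_t pid = waitpid(-1, &status, 0);
        if (pid > 0)
            reaped++;
        else if (pid < 0 && errno == EINTR)
            continue; /* interrupted by a signal, retry */
        else
            break;    /* ECHILD: no children left */
    }
    return reaped;
}
```

In a long-running daemon, a loop like demo_reap_exited() would typically be driven from a SIGCHLD handler or a periodic housekeeping pass, so that exited children do not linger as zombies.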




-- 
Jason Parker 
Digium 

