[asterisk-dev] slow slow menuselect (and proposal for a fix)

Sat Dec 15 13:19:53 CST 2007

On Sat, Dec 15, 2007 at 10:07:32AM -0800, Luigi Rizzo wrote:
> On Sat, Dec 15, 2007 at 05:08:05PM +0200, Tzafrir Cohen wrote:
> > On Sat, Dec 15, 2007 at 06:33:13AM -0800, Luigi Rizzo wrote:
> > > I experienced recently that menuselect and related processing is
> > > taking a long long time on certain platforms, particularly if you
> > > are using network-mounted filesystems, so investigated a bit the
> > > problem.
> > > 
> > > The reason seems to be that, if i am not mistaken, menuselect itself
> > > does a full pass on the entire set of sources to check dependencies.
> > > There is a second full pass, with awk being run on each source file
> > > while making up menuselect-tree.  This seems to happen every time
> > > I run 'make' in the top level, even without any change in the
> > > sources.
> > > 
> > > Remember we have over 300 files and 310k lines of C code now in the tree,
> > > which are not a problem on a fast modern machine with a fast local
> > > disk and large disk cache. Take one of these features away and
> > > you'll see a huge difference.
> > > 
> > > The attempt to put in the same file (the .c source) code, documentation
> > > and build dependencies is commendable. But in my opinion it is completely
> > > unreasonable that we have to process 300 files and 310k lines of code
> > > every time we run 'make', only to collect the 1000 or so lines
> > > that we care about (and find that they are unchanged).
> > > 
> > > I see two steps to remove this inefficiency:
> > >  
> > > 1. (simple): remove the dependency on menuselect.makeopts from a number
> > >         of targets, introducing it only as a dependency for makeopts
> > > 
> > > 2. (slightly harder): remove the MODULEINFO and MAKEOPTS sections
> > >         from foo.c source files, and put them somewhere else, either
> > >         in some foo.opts files that go along with them, and/or in a central
> > >         place for each directory
> > 
> > 3. extracting the information at ./configure time rather than at 'make'
> > time.

If the scanning is too slow, maybe it could be optimized. One way is to
replace everything with a perl/python/whatever script. Such a script
would not execute multiple binaries multiple times. Currently, for every
file, there are several invocations of printf and awk.
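
To illustrate the single-script idea, here is a rough Python sketch that
walks the tree once in one process and collects the first MODULEINFO
block of each file. The function names (extract_moduleinfo, scan_tree)
and the "*/*.c" layout are assumptions made for the example; this is not
what menuselect currently does.

```python
import glob
import os

START = "/*** MODULEINFO"
END = "***/"


def extract_moduleinfo(lines):
    """Return the lines strictly between the first MODULEINFO markers."""
    body = []
    inside = False
    for line in lines:
        if not inside:
            # Still looking for the opening marker.
            if START in line:
                inside = True
        elif END in line:
            # Closing marker: stop reading this file early.
            break
        else:
            body.append(line.rstrip("\n"))
    return body


def scan_tree(root="."):
    """Map each */*.c file under root to its MODULEINFO body (maybe empty)."""
    result = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "*.c"))):
        with open(path, encoding="utf-8", errors="replace") as f:
            result[os.path.relpath(path, root)] = extract_moduleinfo(f)
    return result


if __name__ == "__main__":
    for name, body in scan_tree().items():
        print("Filename:", name)
        for line in body:
            print(line)
        print()
```

Everything happens inside one interpreter, so no extra process is
started per source file.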

An obvious optimization is to replace all of that with one perl script.
Given that some people would disapprove of adding an extra dependency on
perl, here is a (g?)awk command that extracts the data from all files:

time awk 'FNR==1 {printf "\n\nFilename: %s\n",FILENAME}; /\/\*\*\* MODULEINFO/,/\*\*\*\// {print};' */*.c  >version2

real    0m2.047s
user    0m0.892s
sys     0m0.044s

The result:

Filename: agi/strcompat.c

Filename: apps/app_adsiprog.c
/*** MODULEINFO
        <depend>res_adsi</depend>
 ***/

Filename: apps/app_alarmreceiver.c

Now we have all the data in one file. There's a clear separator (a line
starting with "Filename:", preceded by an empty line). That's easy to
parse by the next stage of the pipe. I figure that the next stage would
also get rid of the '/***' and '***/' marker lines.
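
A possible next stage, sketched in Python under the assumption that the
input looks exactly like the sample above. The function name
parse_combined is made up for the example; it splits on the "Filename:"
separator and drops the marker lines.

```python
def parse_combined(text):
    """Split 'Filename: ...' delimited output into {filename: [body lines]}."""
    modules = {}
    current = None
    for line in text.splitlines():
        if line.startswith("Filename: "):
            # A new record starts here; the rest of the line is the path.
            current = line[len("Filename: "):]
            modules[current] = []
        elif current is None or not line.strip():
            # Skip blank lines and anything before the first separator.
            continue
        elif line.lstrip().startswith(("/***", "***/")):
            # Drop the '/*** MODULEINFO' and '***/' marker lines.
            continue
        else:
            modules[current].append(line.strip())
    return modules


sample = """

Filename: agi/strcompat.c

Filename: apps/app_adsiprog.c
/*** MODULEINFO
        <depend>res_adsi</depend>
 ***/

Filename: apps/app_alarmreceiver.c
"""

if __name__ == "__main__":
    for name, body in parse_combined(sample).items():
        print(name, body)
```

Files without a MODULEINFO block end up with an empty list, so they are
still known to the later stages.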

> 
> which is very close to #1, except that ./configure is also slow as hell
> on Windows :( 

But you don't need to rescan often. You know when you have an extra
module. 

We can separate this into an extra step that is invoked explicitly by
configure but can also be invoked independently.

> 
> > (2) breaks the option to just drop a file into the build tree and get it
> > built.

If it's too slow, maybe it could be optimized.

> 
> i know - all of these options to some degree prevent automatic
> handling of files stuffed into the tree. But otherwise full dependency
> checking is too expensive to be run every time. Of course we must
> provide commands to run such checks (./configure, menuselect/makeopts,
> ...) but the 'common' make targets should limit themselves to handle
> 'common' situation i.e. changes to individual sources, not addition
> of removal of entire files and modules.

It is not so expensive. It is just not optimized right now.

(and that said: I still don't see any benefit from the XML format vs.
a simpler and easier-to-parse format for our relatively simple data)

-- 
               Tzafrir Cohen
icq#16849755              jabber:tzafrir.cohen at xorcom.com
+972-50-7952406           mailto:tzafrir.cohen at xorcom.com
http://www.xorcom.com  iax:guest at local.xorcom.com/tzafrir