[asterisk-dev] [Code Review] 3668: refcounter.py uses excessive RAM when processing large refs file

Thu Jun 26 02:43:19 CDT 2014

> On June 25, 2014, 1:54 p.m., wdoekes wrote:
> > /branches/1.8/contrib/scripts/refcounter.py, line 78
> > <https://reviewboard.asterisk.org/r/3668/diff/1/?file=60481#file60481line78>
> >
> >     You could attempt to parse this: a number might hash cheaper than a string.
> 
> Corey Farrell wrote:
>     I'd rather not.  I'm really not that good with Python.  The goal of this review is to prevent refcounter.py from using 10x more RAM than the size of the file being processed.  I tried to avoid adding CPU overhead, but reducing CPU usage is not a goal here.
>     
>     OTOH if you tell me how to parse this / use a number for the hash key I'm willing to incorporate your suggestion.

>>> address = '0xdeadbeef'

>>> long(address, 16)
3735928559L

>>> '%x' % long(address, 16)
'deadbeef'

>>> '%x' % long(address[2:], 16)  # the "0x" is optional
'deadbeef'
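
Note that long() only exists in Python 2. As a minimal sketch (my own variable names, not anything from refcounter.py), the same parsing with int() should behave the same on Python 2 and Python 3:

```python
address = '0xdeadbeef'

# int() with base 16 accepts the string with or without the "0x" prefix.
as_number = int(address, 16)
without_prefix = int(address[2:], 16)

# Formatting with %x gives back the hex digits without the prefix.
as_hex = '%x' % as_number

print(as_number)   # 3735928559
print(as_hex)      # deadbeef
```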

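However, to satisfy my curiosity, I ran a quick test, and it turns out parsing is a bad idea after all: using the parsed number as the key came out slower than using the plain string, on both Python 3 and Python 2.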

http://fpaste.org/113347/raw/

$ python3 hashspeed.py 
string: 32.27
parsed: 37.88

$ python hashspeed.py 
string: 22.68
parsed: 25.96
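
In case the paste is no longer reachable: the following is not the original hashspeed.py, only a rough sketch of the kind of comparison meant here. The sample addresses, the function names and the repeat count are all assumptions made for illustration, so the absolute timings will not match the numbers above.

```python
import timeit

# Hypothetical sample data: hex addresses shaped like those in a refs file.
ADDRESSES = ['0x%08x' % (0x1000 + 16 * i) for i in range(1000)]


def count_by_string(addresses):
    """Count occurrences using the raw address string as the dict key."""
    counts = {}
    for addr in addresses:
        counts[addr] = counts.get(addr, 0) + 1
    return counts


def count_by_number(addresses):
    """Count occurrences using the parsed integer as the dict key."""
    counts = {}
    for addr in addresses:
        key = int(addr, 16)  # the extra parsing step under test
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == '__main__':
    for name, func in (('string', count_by_string),
                       ('parsed', count_by_number)):
        elapsed = timeit.timeit(lambda: func(ADDRESSES), number=200)
        print('%s: %.2f' % (name, elapsed))
```

Both functions build the same counts; the only difference is the int() call per line, which is the overhead being measured.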

Ergo: never mind :)

- wdoekes

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3668/#review12311
-----------------------------------------------------------

On June 26, 2014, 12:22 a.m., Corey Farrell wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3668/
> -----------------------------------------------------------
> 
> (Updated June 26, 2014, 12:22 a.m.)
> 
> 
> Review request for Asterisk Developers and Matt Jordan.
> 
> 
> Bugs: ASTERISK-23921
>     https://issues.asterisk.org/jira/browse/ASTERISK-23921
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> When processing a 212MB refs file, refcounter.py used over 3GB of RAM.  This caused swap thrashing and temporarily froze my system.  The included patch makes the following memory optimizations:
> * skewed and finished object lists are only populated if not disabled
> * each line is saved to its object already formatted as the final output line
> 
> Saving the whole lines in output format seems to reduce memory usage by 80-90%.  Ignoring finished/skewed objects caused an additional reduction of about 75% on my system.
> 
> 
> Diffs
> -----
> 
>   /branches/12/contrib/scripts/refcounter.py 417246 
> 
> Diff: https://reviewboard.asterisk.org/r/3668/diff/
> 
> 
> Testing
> -------
> 
> Watched 'top -c' with refcounter.py running on the 212MB refs log.  The highest memory usage I saw was 127MB with '-sn' options and 472MB with full output.
> 
> 
> Thanks,
> 
> Corey Farrell
> 
>
