Perl optimisation help

Peter plp-ysDPMY98cNQDDBjDh4tngg at public.gmane.org
Thu Jun 8 21:47:14 UTC 2006


On Thu, 8 Jun 2006, Lennart Sorensen wrote:

> On Thu, Jun 08, 2006 at 11:07:45PM +0300, Peter wrote:
>> I did it with a hash and it is 5 times faster. Still I would like to
>> know what the fastest way to do $var=$var.$add; is.
>
> Perl doesn't have something icky like $var .= $more; does it?

Not that I know of.
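
For reference, Perl does accept an append operator written exactly that way (.=). A minimal sketch follows; the sample strings are made up for illustration, and no claim about speed is implied:

```perl
use strict;
use warnings;

# Hypothetical sample data, only for illustration.
my @pieces = ("alpha\n", "beta\n", "gamma\n");

my $var = '';
for my $add (@pieces) {
    $var .= $add;    # append; gives the same result as $var = $var . $add
}

print $var;
```

Whether it is actually quicker than $var = $var . $add on a given perl build is something to measure rather than assume.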

> The problem is likely that when a string is created a certain amount of
> space is allocated, but when adding to it, often you may have to copy
> the whole string into a new bigger memory block before adding the new
> stuff in.  Not sure how arrays are managed in perl, like do they have to
> be copied and reallocated when stuff is added.  Can you predefine how
> many entries it should have?  Does join do all the preallocation at
> once, or does it just expand to a dumb loop doing the same thing you
> were already doing?

In C one allocates a buffer with room to spare and fills it by appending
strings. The length is never lost: a pointer marks the last byte written
and a counter keeps track of the length. If one runs out of space, one
reallocates the buffer (in place when possible). When correctly set up
this happens rarely (say once in a thousand writes). The speed can be
amazing.

> How do you do it with a hash?

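use strict;
use warnings;

my %hash = ();
my $idx  = 0;

# Store each line under a sequential integer key. Reading STDIN is only a
# stand-in here for the original loop condition.
while (defined(my $line = <STDIN>)) {
   $hash{$idx++} = $line;
}

# Rebuild the text by concatenating the stored lines in key order.
my $res = '';
my $i   = 0;
while ($i < $idx) {
   $res = $res . $hash{$i++};
}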

This is fast, and could be faster by adding buckets. I think that Perl
measures the length of the string and caches it. If the next access is
'soon enough' the length need not be calculated again. The speedup
factor is between 6 and 13 times (!) vs. using '.' on its own.
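
On adding buckets: Perl lets you assign to keys() to request buckets for a hash ahead of time. The sketch below assumes the number of lines is known or can be estimated up front; the count and the generated lines are hypothetical, and I have not measured whether this helps for this workload:

```perl
use strict;
use warnings;

my $expected = 10_000;    # assumed line count, known or estimated up front

my %hash;
keys(%hash) = $expected;  # ask Perl to allocate buckets ahead of time

# Hypothetical input: generated lines instead of a real file.
my $idx = 0;
for my $n (1 .. $expected) {
    $hash{$idx++} = "line $n\n";
}

# Rebuild the text in key order.
my $res = '';
for my $i (0 .. $idx - 1) {
    $res .= $hash{$i};
}

print length($res), " bytes from $idx lines\n";
```

Presizing only changes how the hash is laid out internally; the contents and the resulting string are the same as without it.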

join is extremely slow. I will try it again in a different config. Using 
join is a little like using '.' I think.
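
To compare the variants on equal terms, something like the following could be used. It is only a sketch: the input is generated rather than real data, the sub names and the 'bench' argument are made up for the example, and the timings will depend on the perl version and the machine. It relies on the core Benchmark module:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: 10_000 generated lines standing in for real data.
my @lines = map { "line $_\n" } 1 .. 10_000;

# The form discussed in this thread: $var = $var . $add
sub by_concat { my $s = ''; $s = $s . $_ for @lines; return $s; }

# The append operator
sub by_append { my $s = ''; $s .= $_ for @lines; return $s; }

# A single join over the whole list
sub by_join { return join('', @lines); }

# Run the timing only when asked for, e.g.: perl script.pl bench
cmpthese(-1, {
    'concat' => \&by_concat,
    'append' => \&by_append,
    'join'   => \&by_join,
}) if @ARGV && $ARGV[0] eq 'bench';
```

All three subs build the same string, so any difference that shows up is down to how the string is assembled.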

Peter
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




