Perl optimisation help

Peter plp-ysDPMY98cNQDDBjDh4tngg at public.gmane.org
Fri Jun 9 15:42:21 UTC 2006


On Fri, 9 Jun 2006, Lennart Sorensen wrote:

> On Fri, Jun 09, 2006 at 12:47:14AM +0300, Peter wrote:
>> %hash=();
>> $idx=0;
>>
>> cond loop {
>>   $hash{$idx++}=$line;
>> }
>
> Isn't using a hash with numerical keys rather like using an array except
> possibly less efficient?

Why less efficient? Arrays with variable-length entries have to be
stored somehow internally (i.e. not as plain arrays). One good way is
hashes. Another is an array of pointers to typed (with size!) data on
the heap.
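
One way to settle the hash-versus-array question is to time both on the
same data. The following is only a minimal sketch, not a definitive
benchmark: the test data, the sub names via_hash and via_array, and the
run length are all assumptions made for illustration. It uses the
Benchmark module that ships with Perl, and the numbers will vary with
the Perl version and the machine.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical test data: 10_000 lines of varying length.
my @lines = map { "line $_ " . ("x" x ($_ % 40)) . "\n" } 1 .. 10_000;

# Collect the lines in a hash with numeric keys, then concatenate.
sub via_hash {
    my %hash;
    my $idx = 0;
    $hash{$idx++} = $_ for @lines;
    my $res = '';
    my $i   = 0;
    $res .= $hash{$i++} while $i < $idx;
    return $res;
}

# Collect the lines in an array with push, then concatenate.
sub via_array {
    my @array;
    push @array, $_ for @lines;
    my $res = '';
    $res .= $_ for @array;
    return $res;
}

# Both variants must build the same string before timings mean anything.
die "results differ" unless via_hash() eq via_array();

# A negative count asks Benchmark to run each variant for at least
# that many CPU seconds and then print the timings.
timethese(-1, { hash => \&via_hash, array => \&via_array });
```

Checking that both subs return the same string first guards against
timing two things that do different work.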

>> $res='';
>> $i=0;
>> while($i<$idx) {
>>   $res = $res.$hash{$i++};
>> }
>>
>> this is fast, and could be faster by adding buckets. I think that Perl
>> measures the length of the string and caches it. If the next access is
>> 'soon enough' the length must not be calculated again. The speedup
>> factor is between 6 and 13 times (!) vs. using '.' standalone.
>>
>> join is extremely slow. I will try it again in a different config. Using
>> join is a little like using '.' I think.
>
> Well something like this might work faster than hashes (not sure which
> are faster):
>
> @array=();
>
> cond loop {
> 	push @array,$line;
> }
>
> $res='';
> foreach $line(@array) {
> 	$res=$res.$line;
> }

This is just like @array = (@array,$line); and it is slow, slower than
the other methods. I think it all comes down to C-convention storage
for strings, with Perl throwing away the length of a string if enough
calculations come in between. Interestingly, $a .= $b; is much faster
than $a = $a.$b; if used occasionally (i.e. with lots of other
instructions in the loop). Both are equally fast in a tight loop (which
Perl likely optimizes).
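
The comparison between .= and $a = $a.$b (and join) can be reproduced
with a small script. This is a rough sketch under stated assumptions:
the input lines and the sub names with_append, with_concat and
with_join are made up for illustration, and it only covers the tight
loop case, not a loop with lots of other work in between. Results
depend on the Perl version and hardware, so no expected figures are
given.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: 5_000 short lines.
my @lines = map { "entry number $_\n" } 1 .. 5_000;

# Append in place with the .= operator.
sub with_append {
    my $res = '';
    $res .= $_ for @lines;
    return $res;
}

# Build a new value with . and assign it back.
sub with_concat {
    my $res = '';
    $res = $res . $_ for @lines;
    return $res;
}

# Let join assemble the whole string in one call.
sub with_join {
    return join '', @lines;
}

# All three must agree before the timings are worth comparing.
my $expected = with_join();
die "append differs" unless with_append() eq $expected;
die "concat differs" unless with_concat() eq $expected;

# Run each variant for at least one CPU second and print a
# comparison table of the measured rates.
cmpthese(-1, {
    append => \&with_append,
    concat => \&with_concat,
    join   => \&with_join,
});
```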

My speed is reasonable now: using .= I go through data at ~600 kBytes/sec
without any unexpected CPU load peaks. A faster CPU would help (this
runs on a 500 MHz P3 now).

I will put this question on a Perlmongers list when I have time.

Thanks, the list really helped.

Peter
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




