Perl optimisation help
Peter
plp-ysDPMY98cNQDDBjDh4tngg at public.gmane.org
Fri Jun 9 15:42:21 UTC 2006
On Fri, 9 Jun 2006, Lennart Sorensen wrote:
> On Fri, Jun 09, 2006 at 12:47:14AM +0300, Peter wrote:
>> %hash=();
>> $idx=0;
>>
>> cond loop {
>> $hash{$idx++}=$line;
>> }
>
> Isn't using a hash with numerical keys rather like using an array except
> possibly less efficient?
Why less efficient? Arrays with variable-length entries have to be
stored somehow internally (i.e. not as plain arrays). One good way is
hashes. Another is an array of pointers to typed (with size!) data on
the heap.
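For what it's worth, the question could be settled by measuring. Here
is a minimal sketch using the core Benchmark module. The test data and
the sub names (fill_hash, fill_array) are made up for illustration, and
the numbers will depend on the Perl version and the machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 10_000 lines of varying length.
my @lines;
push @lines, ('x' x (1 + $_ % 80)) . "\n" for 1 .. 10_000;

# Store each line in a hash keyed by a running integer index.
sub fill_hash {
    my %hash;
    my $idx = 0;
    $hash{$idx++} = $_ for @lines;
    return \%hash;
}

# Store each line by appending it to an array.
sub fill_array {
    my @array;
    push @array, $_ for @lines;
    return \@array;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    hash  => \&fill_hash,
    array => \&fill_array,
});
```

This only times the storing step; reading the entries back in order
would need a second pair of subs.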
>> $res='';
>> $i=0;
>> while($i<$idx) {
>> $res = $res.$hash{$i++};
>> }
>>
>> this is fast, and could be faster by adding buckets. I think that Perl
>> measures the length of the string and caches it. If the next access is
>> 'soon enough' the length must not be calculated again. The speedup
>> factor is between 6 and 13 times (!) vs. using '.' standalone.
>>
>> join is extremely slow. I will try it again in a different config. Using
>> join is a little like using '.' I think.
>
> Well something like this might work faster than hashes (not sure which
> are faster):
>
> @array=();
>
> cond loop {
> push @array,$line;
> }
>
> $res='';
> foreach $line(@array) {
> $res=$res.$line;
> }
This is just like @array = (@array,$line); and it is slow, slower than
the other methods. I think it all revolves around C-convention
storage for strings, with Perl throwing away the length of a string if
enough calculations come in between. Interestingly, $a .= $b; is much
faster than $a = $a.$b; if used occasionally (i.e. with lots of other
instructions in the loop). Both are equally fast in a tight loop (which
Perl likely optimizes).
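A minimal sketch of how the three ways of building the string could be
timed against each other in a tight loop, again with the core Benchmark
module. The input data and the sub names are assumptions for
illustration only, and it does not reproduce the "lots of instructions
in the loop" case.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: 5_000 short lines.
my @lines;
push @lines, "line $_\n" for 1 .. 5_000;

# $res = $res . $line, as in the quoted loops.
sub concat_assign {
    my $res = '';
    $res = $res . $_ for @lines;
    return $res;
}

# $res .= $line, appending in place.
sub concat_append {
    my $res = '';
    $res .= $_ for @lines;
    return $res;
}

# A single join over the whole list.
sub concat_join {
    return join '', @lines;
}

# Sanity check: all three must build the same string.
die "results differ"
    unless concat_assign() eq concat_append()
        && concat_append() eq concat_join();

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    assign => \&concat_assign,
    append => \&concat_append,
    join   => \&concat_join,
});
```

The results may well differ between Perl versions, so this says
nothing general about which form is fastest.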
My speed is reasonable now: using .= I go through data at ~600 kBytes/sec
without any unexpected CPU load peaks. A faster CPU would help (this
runs on a 500 MHz P3 now).
I will put this question on a Perlmongers list when I have time.
Thanks, the list really helped,
Peter
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml