Postgres/Perl performance help needed

Madison Kelly linux-5ZoueyuiTZhBDgjK7y7TUQ at public.gmane.org
Tue May 25 20:16:23 UTC 2004


Hi all,

   The intrepid "Asker of questions" is back (so sorry!).

   I am hoping I can pick the brains of those of you here who play with 
postgresql a lot. Maybe you can help me configure the server to be more 
efficient or suggest a different way to approach my code.

   I have run into a problem where I need to write data into a 
postgresql database and it is unbearably slow to do the task. I have 
been reading (and will continue to read after sending this) what I can 
find on the postgresql.org site and elsewhere that google points me but 
so far my attempts to resolve the problem have in fact made it worse.

   Specs: I have Fedora Core 1 on a Pentium3 650MHz CPU with 448MB RAM 
(it's a box make of parts) running in runlevel 3 with no X up.

   I have tried increasing the amount of swap space by passing:

echo 128000000 > /proc/sys/kernel/shmmax

   to the kernel and editing /var /lib/pgsql/data/postgresql.cong to have:

shared_buffers = 15200
sort_mem = 32168

   Before I started messing with things I record a directory with 2,490 
files and folders (just the names, obviously!) in 23 seconds which was 
not reasonable. When I tried to record a filesystem with 175,000 records 
it took 32 minutes... Since I have started tweaking the same number of 
records takes 35 seconds.

   What the script does during this time is:
- Read the contents of an 'ls' call into an array
- Process each entry in that array by spliting out the data in string 
variables
- Look at the permission to see if it is a directory or a file.
- If it is a file, check within the database to see if the record exists
   - If it exists, update it
   - If it does not exist, insert it
- If it is a direcotry first check it against an array of ignored 
directories
   - If the directory isn't to ignored:
     - Check to see if the directory already exists as a record
       - If it does, update it
       - if it does not, insert it
       - Read the contents of the subdirectory using the same steps here.


   Thanks everyone!!

Madison

   Here is the code snipet:

=- Begin perl code snippet -=

$starttime=time;
$real_dir=$mntdir;
# Set the 'file_exist' flag to 'false' and reset exiting files to 'true'.
$DB->do("UPDATE file_dir SET file_exist='f' WHERE 
file_src_uuid='$file_src_uuid'") || die "$DBI::errstr";
# Jump to our file list sub routine
&list_files($real_dir, $exclude_list_num, $relative_dir);
$endtime=time;
$totaltime=($endtime - $starttime);

sub list_files
{
	my $real_dir = shift;
	my $exclude_list_num = shift;
	my $relative_dir = shift;
	print " \n";
	print " =- Working in directory; Real Dir: '$real_dir', Relative Dir: 
'$relative_dir' -=\n";
	print "Permission\t Num\t Owner\t Group\t Size\t Mod Date\t Mod 
Time\t\t GMT\t File Name\n";
	open (LS, "sudo ls -lA --full-time '$real_dir' 2>&1 |");
	my @file_list;
	while (<LS>)
	{
		next if /^total \d+/;
		s/\n//;
		push ( @file_list, $_ );
	}
	close(LS);
	for ( @file_list )
	{
		my ($file_perm, $num, $file_own_user, $file_own_grp, $file_size, 
$file_mod_date, $file_mod_time, $file_mod_time_zone, 
$file_name)=split/\s+/, $_, 9;
		next if ( $file_perm eq "total" );
		if ( $file_perm =~ /^d/ )
		{
			$baddir=0;
			for ($j=0; $j <= $exclude_list_num; $j++)
			{
				my $dircheck;
				if ( $real_dir eq "/" )
				{
					$dircheck=$real_dir.$file_name;
				}
				else
				{
					$dircheck=$real_dir."/".$file_name;
				}
				if ( $exclude_list[$j] eq $dircheck )
				{
					$baddir=1;
				}
			}
			if ( $baddir == 0 )
			{
				if ( $real_dir eq "/" )
				{
					$passed_real_dir=$real_dir.$file_name;
				}
				else
				{
					$passed_real_dir=$real_dir."/".$file_name;
				}
				$file_parent_dir=$relative_dir;
				if ( $relative_dir eq "/" )
				{
					$passed_relative_dir=$relative_dir.$file_name;
				}
				else
				{
					$passed_relative_dir=$relative_dir."/".$file_name;
				}
				$i++;
				print "$file_perm\t $file_own_user\t $file_own_grp\t $file_size\t 
$file_mod_date\t $file_mod_time\t $file_mod_time_zone\t $file_name\n";
				$DBreq=$DB->prepare("SELECT null FROM file_dir WHERE 
file_src_uuid='$file_src_uuid' AND file_parent_dir='$relative_dir' AND 
file_name='$file_name'") || die "$DBI::errstr";
				$num=$DBreq->execute();
				if ( $num > 0 )
				{
  					$DB->do("UPDATE file_dir SET file_perm='$file_perm', 
file_own_user='$file_own_user', file_own_grp='$file_own_grp', 
file_size=$file_size, file_mod_date='$file_mod_date', 
file_mod_time='$file_mod_time', 
file_mod_time_zone='$file_mod_time_zone', file_exist='t' WHERE 
file_src_uuid='$file_src_uuid' AND file_parent_dir='$relative_dir' AND 
file_name='$file_name'") || die "$DBI::errstr";
				}
				else
				{
					$DB->do("INSERT INTO file_dir ( file_src_uuid, file_name, file_dir, 
file_parent_dir, file_perm, file_own_user, file_own_grp, file_size, 
file_mod_date, file_mod_time, file_mod_time_zone, file_backup, 
file_display, file_exist ) VALUES ( '$file_src_uuid', '$file_name', 't', 
'$file_parent_dir', '$file_perm', '$file_own_user', '$file_own_grp', 
$file_size, '$file_mod_date', '$file_mod_time', '$file_mod_time_zone', 
0, 0, 't' )") || die "$DBI::errstr";
				}
				&list_files($passed_real_dir, $exclude_list_num, $passed_relative_dir);
			}
		}
		elsif ( $file_perm =~ /^l/ )
		{
			# MADDY: Add the option to let the user decide whether to follow 
symlinks or not then catch it here.
			$file_dir="f";
			$file_parent_dir=$relative_dir;
			$i++;
			print "$file_perm\t $num\t $file_own_user\t $file_own_grp\t 
$file_size\t $file_mod_date\t $file_mod_time\t $file_mod_time_zone\t 
$file_name\n";
			$DBreq=$DB->prepare("SELECT null FROM file_dir WHERE 
file_src_uuid='$file_src_uuid' AND file_parent_dir='$relative_dir' AND 
file_name='$file_name'") || die "$DBI::errstr";
			$num=$DBreq->execute();
			if ( $num > 0 )
			{
				$DB->do("UPDATE file_dir SET file_perm='$file_perm', 
file_own_user='$file_own_user', file_own_grp='$file_own_grp', 
file_size=$file_size, file_mod_date='$file_mod_date', 
file_mod_time='$file_mod_time', 
file_mod_time_zone='$file_mod_time_zone', file_exist='t' WHERE 
file_src_uuid='$file_src_uuid' AND file_parent_dir='$relative_dir' AND 
file_name='$file_name'") || die "$DBI::errstr";
			}
			else
			{
				$DB->do("INSERT INTO file_dir ( file_src_uuid, file_name, file_dir, 
file_parent_dir, file_perm, file_own_user, file_own_grp, file_size, 
file_mod_date, file_mod_time, file_mod_time_zone, file_backup, 
file_display, file_exist ) VALUES ( '$file_src_uuid', '$file_name', 'f', 
'$file_parent_dir', '$file_perm', '$file_own_user', '$file_own_grp', 
$file_size, '$file_mod_date', '$file_mod_time', '$file_mod_time_zone', 
0, 0, 't' )") || die "$DBI::errstr";
			}
		}
		elsif ( $file_perm =~ /^-/ )
		{
			$file_dir="f";
			$file_parent_dir=$relative_dir;
			$i++;
			print "$file_perm\t $num\t $file_own_user\t $file_own_grp\t 
$file_size\t $file_mod_date\t $file_mod_time\t $file_mod_time_zone\t 
$file_name\n";
			$DBreq=$DB->prepare("SELECT null FROM file_dir WHERE 
file_src_uuid='$file_src_uuid' AND file_parent_dir='$relative_dir' AND 
file_name='$file_name'") || die "$DBI::errstr";
			$num=$DBreq->execute();
			if ( $num > 0 )
			{
				$DB->do("UPDATE file_dir SET file_perm='$file_perm', 
file_own_user='$file_own_user', file_own_grp='$file_own_grp', 
file_size=$file_size, file_mod_date='$file_mod_date', 
file_mod_time='$file_mod_time', 
file_mod_time_zone='$file_mod_time_zone', file_exist='t' WHERE 
file_src_uuid='$file_src_uuid' AND file_parent_dir='$relative_dir' AND 
file_name='$file_name'") || die "$DBI::errstr";
			}
			else
			{
				$DB->do("INSERT INTO file_dir ( file_src_uuid, file_name, file_dir, 
file_parent_dir, file_perm, file_own_user, file_own_grp, file_size, 
file_mod_date, file_mod_time, file_mod_time_zone, file_backup, 
file_display, file_exist ) VALUES ( '$file_src_uuid', '$file_name', 'f', 
'$file_parent_dir', '$file_perm', '$file_own_user', '$file_own_grp', 
$file_size, '$file_mod_date', '$file_mod_time', '$file_mod_time_zone', 
0, 0, 't' )") || die "$DBI::errstr";
			}
		}
		else
		{
			system 'echo " | |- \''.$file_name.'\' is not a known file type, 
skipping." >> '.$log;
		}
	}
	print " =- Return from; Real Dir: '$real_dir', Relative Dir: 
'$relative_dir':\n";
	print " \n";
	return;
}

=- End perl code snippet -=

--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml





More information about the Legacy mailing list