Sending errors from psql to error file
Devin Whalen
devin-Gq53QDLGkWIleAitJ8REmdBPR1lH4CV8 at public.gmane.org
Thu Aug 12 18:32:53 UTC 2004
On Thu, 2004-08-12 at 13:56, cbbrowne-HInyCGIudOg at public.gmane.org wrote:
> > Hey,
> >
> > I am trying to migrate a client from one database to another.
> > Basically we designed a web application for them using PostgreSQL but
> > we have made many changes to the design of our application since
> > version 1. Now they want to upgrade to our new version. So basically
> > I have to pg_dump their current data and then import it into our new
> > schema. Now, of course I realize that there are going to be errors.
> > But they have a lot and I mean a LOT of data. I don't want to have to
> > sit there and watch the import go by and monitor every insert, I want
> > to run a command and then look in a file for any errors after the
> > import is complete. I tried this command but it didn't work:
> >
> > gunzip -c dataInserts.sql.gz | psql dbname -U username | grep "ERROR:*"
> > > import_errors
> >
> > (obviously dbname and username have been changed to protect the innocent
> > ;))
> >
> > Any help is appreciated.
>
> Looks like a questionable approach to me.
>
> What I'd do (what I /have done/) is to set up a schema for the 'legacy
> data,' and take the following approach:
>
> 1. Extract, from the old system, the existing data, into an
> "extraction" schema.
>
> select [fields] into extraction.t1 from public.t1 where [probably take
> all the data];
> -- And do this for all the tables you want to convert
>
> 2. Dump "extraction" and load it into the new system
>
> pg_dump -n extraction old_database | psql new_database
>
> 3. Set up selects to take the data from the "extraction" tables and
> stuff them into the new schema...
>
> select [fields] into public.t1 from extraction.t1;
>
> I actually took it some steps further, where:
>
> a) The "extraction" used joins and such to make the data look as clean
> as possible, getting rid of any internal pointer-like numbers...
>
> b) The "insertion" into the new system involved setting up
> stored procedures to create the assorted objects, so that
> I'd do things like:
>
> select create_domain(domain attributes)
> from extraction.internet_domains;
> select create_something_else (its attributes)
> from extraction.something_else;
>
> This approach has the unfortunate quality that as soon as you hit a
> "borked" record, it'll crash and burn. But if you can address
> "cleansing" the data well, it'll load it all in one fell swoop :-).
>
> And it keeps the old data handy nearby for reference, _untouched_.
> --
> output = ("cbbrowne" "@" "cbbrowne.com")
> http://www.ntlug.org/~cbbrowne/internet.html
> (THASSERT (PLANNER RG))
> -- Example of HACKER statement.
I don't understand what is wrong with my method.
I have an old db with a schema and data, and a new schema with similar
tables: some exactly the same, some new, some with changed field names,
some data that has to remain intact, and so on.
So I pg_dump the data from the old db, then try to import it and send
all errors to my error file, like so:
$ pg_dump oldDBname -D -a -U oldDBuser | gzip -9 > dataInserts.sql.gz
$ zcat dataInserts.sql.gz | psql -d dbname -U dbuser -f - 2>>import_errors
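Incidentally, I now see why my earlier grep attempt caught nothing: psql
writes its error messages to stderr, not stdout, so a plain pipe never sees
them. If I wanted to filter as well as capture, redirecting stderr into the
pipe should do it (an untested sketch along the same lines):

$ zcat dataInserts.sql.gz | psql -d dbname -U dbuser -f - 2>&1 | grep "ERROR" > import_errors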
After the import I check my error file and find an error like this one
(an actual error that I have run into):
psql:<stdin>:177249: ERROR: column "dvd_software_version" of relation
"device_details" does not exist
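Rather than scrolling through a dump that size in an editor, I can print
just the offending line (a quick sketch, reusing the line number from the
psql error):

$ zcat dataInserts.sql.gz | sed -n '177249p'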
Looking at line 177249 and then at the new schema for device_details, I
see that dvd_software_version is now dvd_version. On the old db I run:
ALTER TABLE device_details RENAME COLUMN dvd_software_version TO dvd_version;
then export the data again, and the insert should work. (Or I could do a
search and replace in vi to change dvd_software_version to dvd_version.)
I actually have many import errors; I will fix them one by one in this
manner, then do the export and import once all errors are fixed.
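The vi search-and-replace could also be done non-interactively on the
compressed dump; a sketch using the names from my example above:

$ zcat dataInserts.sql.gz | sed 's/dvd_software_version/dvd_version/g' | gzip -9 > dataInserts-fixed.sql.gz

(With the caveat that a blind substitution would also touch any data
values that happen to contain that string, so the ALTER TABLE on the old
db is probably the safer fix.)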
Basically, this is what I am doing right this second: going through the
error log, seeing what errors I got, and figuring out how to fix them.
Then I will just run the script again. What is questionable about this?
I would really like to know. I am not saying my way is perfect; it is
just that this is production data and I don't want to cost my company
big $$$!
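Once I think the dump is clean, a final paranoid run could set psql's
ON_ERROR_STOP variable, so the import aborts at the first remaining error
instead of ploughing past it (a sketch):

$ zcat dataInserts.sql.gz | psql -d dbname -U dbuser -v ON_ERROR_STOP=1 -f - 2>>import_errors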
> And it keeps the old data handy nearby for reference, _untouched_.
The data is always kept untouched with pg_dump.
Later
--
Devin Whalen
Programmer
Synaptic Vision Inc
Phone-(416) 539-0801
Fax- (416) 539-8280
1179A King St. West
Toronto, Ontario
Suite 309 M6K 3C5
Home-(416) 653-3982
--
The Toronto Linux Users Group. Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml