pg_dump - sort data?

From: Michael Swierczek <mike(dot)swierczek(at)gmail(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: pg_dump - sort data?
Date: 2006-02-02 15:19:36
Message-ID: 68b5b5880602020719x2bccc8cx737e33c114f7946@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Ladies and Gentlemen:
We have several Postgres 7.4.x and 8.x databases running on different
machines which are not able to connect directly over IP (including VPN). So
periodically, data from tables is extracted in .csv format, zipped,
encrypted, and transferred to other locations. I have a set of Java JDBC
apps that perform the integration of the extracted data. At this point in
time, the largest database creates a 25MB file from pg_dump.

When I am adding new features to the JDBC apps, I need to run tests
periodically to ensure that the integration worked completely. For very
small test data sets, this is easy - just query the test database afterwards
to make sure the new data is in place. For larger test data sets with
hundreds or thousands of inserts, it's not feasible.

What I have been doing is running pg_dump on the test database, running my
application with a large test data set, running pg_dump again, and trying to
compare the result. The only problem is, pg_dump does not seem to sort the
contents of the "Copy <table> ... " sections. So the before and after files
can have the exact same data in totally different order. What I do is sort
the contents of the before and after dump files using Gnu sort on cygwin.
That puts the contents in alphanumeric order, but naturally it does not keep
them seperate by table.

So my questions are:
1. Does anyone see an easier way for me to test what I have been doing?
2. Is there a way to make pg_dump dump the whole database and then sort the
results by table?
3. If not 1 & 2, it looks like my best solution is a cygwin shell script,
windows batch file, or executable that automates dumping the 10 tables I
need to 10 different files and then sorting each file individually. Does
that sound correct? Are there any existing Postgres tools or add-on tools
that could do this for me?

Thanks very much,
Mike

PS While I'm on the list, I have a somewhat off-topic question. Our
product automatically generates PDF reports from sets of data using Crystal
Reports. Other than the Crystal Reports software, everything we use can be
run on Linux. Is anyone aware of a Linux-compatible alternative to Crystal
Reports? Ideally we would like to use open source, but even a proprietary
system that ran on Linux would be acceptable.

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Richard Broersma Jr 2006-02-02 18:38:50 Generalization Hierarchies & Table Inheritance
Previous Message Tom Lane 2006-02-02 14:57:12 Re: Can one turn off constraints/indices temporarily?