Re: Getting Mysql data into Postgres: least painful methods?

From: Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Getting Mysql data into Postgres: least painful methods?
Date: 2013-01-16 00:09:52
Message-ID: CAD3a31VFJ7-wU0sTTDUg5yjo9vEULhFGDq40+NB092=QhS7a+g@mail.gmail.com
Lists: pgsql-general

Thanks for all the responses. I would totally go with the FDW, were I not
stuck using (someone else's) 8.4.

In this case it looks like 24 tables, with CSV-like import files totaling
7G.

Since there didn't seem to be a clean, simple and automated path from mysql
to postgres, I'm back to skipping mysql entirely and just trying to modify
the mysql files to feed directly into postgres.

To that end, the files have to be transformed a bit, and I've written a short
script to do that. I'm wondering if there's a way to avoid creating
another 7G of slightly-modified import files before feeding them to
postgres. Specifically, is there a way to do something like

\copy my_table FROM '`cat my_import_file | my_transform_script`'
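
The closest thing I've come up with myself is a named pipe, so the transform
runs concurrently with the load and no second copy of the data hits disk.
This is an untested sketch, and the fifo path is made up (my_import_file,
my_table and my_transform_script are the same stand-ins as above):

    mkfifo /tmp/my_table.fifo
    my_transform_script < my_import_file > /tmp/my_table.fifo &
    # then, from within a psql session (I think even inside BEGIN ... COMMIT):
    #   \copy my_table from '/tmp/my_table.fifo'

but maybe there's something more direct.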

My 2 goals here are to be somewhat efficient (by not duplicating the input
files), and to keep this all within a transaction. I could have the script
transform each file separately and pipe it to postgres:

(echo 'copy mytable from stdin...' ; cat my_import_file | my_transform_script) | psql

but I'm thinking that there's no way to group those all into a transaction.
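Unless maybe I generate the whole stream myself, BEGIN/COMMIT included, and
pipe it through a single psql. Something like the untested sketch below, where
table1.txt/table2.txt and mydb are made-up stand-ins for the real import files
and database, and each file is assumed to be named after its table:

    (
      echo 'BEGIN;'
      for f in table1.txt table2.txt; do
        echo "COPY ${f%.txt} FROM stdin;"   # file name doubles as table name
        my_transform_script < "$f"          # transformed rows stream straight to COPY
        printf '%s\n' '\.'                  # end-of-data marker for COPY FROM stdin
      done
      echo 'COMMIT;'
    ) | psql -v ON_ERROR_STOP=1 mydb

That would keep everything in one transaction, though it feels a little fragile.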

Hopefully this makes sense, and any suggestions welcome. Thanks.

Ken

On Fri, Jan 11, 2013 at 7:13 PM, John R Pierce <pierce(at)hogranch(dot)com> wrote:

> On 1/11/2013 3:54 PM, Ken Tanzer wrote:
>
> Here's the fuller description of what I'm trying to do. I've got a
> dataset (a UMLS Metathesaurus subset) that I need to get into a
> Postgres database. It's all reference data, and so will be read-only.
> There are no functions or logic involved. I anticipate having to update it at
> least quarterly, so I'd like to get to a well-grooved import process.
>
> how many tables? if it's just one or a couple tables, can you get the data
> as CSV? then it would be trivial to import into postgres, using the COPY
> command (or \copy from psql)...
>
> another alternative is to investigate "ETL" tools: general-purpose data
> manglers that can connect to a source database (usually any of about
> 20 supported), extract data, transform it if needed, and load it into a
> destination database (from a list of 20 or so typically supported).

--
AGENCY Software
A data system that puts you in control
http://agency-software.org/
ken(dot)tanzer(at)agency-software(dot)org
(253) 245-3801
