Re: "Ungroup" data for import into PostgreSQL

From: "George Weaver" <gweaver(at)cleartagsoftware(dot)com>
To: "Jim Nasby" <Jim(dot)Nasby(at)BlueTreble(dot)com>, "pgsql-general" <pgsql-general(at)postgresql(dot)org>
Subject: Re: "Ungroup" data for import into PostgreSQL
Date: 2015-02-03 14:14:31
Message-ID: 6DAA7E4194F04DF2A6252B11A7B3AC79@D420
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Sorry for the late reply...life interefered...

>From: Jim Nasby

>On 1/15/15 9:43 AM, George Weaver wrote:
>> Hi List,
>>
>> I need to import data from a large Excel spreadsheet into a PostgreSQL
>> table. I have a program that uses ODBC to connect to Excel and extract
>> data using SQL queries. The program then inserts the data into a
>> PostgreSQL table.
>>
>> The challenge with this particular spreadsheet is that it is arrayed
>> thus:
>>
>> Model No 1 Product Code 15
>> Serial No 1 No on Hand
>> Serial No 2 No on Hand
>> Serial No 3 No on Hand
>> Model No 4 Product Code 9
>> Serial No 12 No on Hand
>> Model No 5 Product Code 27
>> Serial No 6 No on Hand
>> Serial No 14 No on Hand
>>
>> etc.
>>
>> I need the data in PostgreSQL arrayed thus
>>
>> Model No 1 Product Code 15 Serial No 1 No on Hand
>> Model No 1 Product Code 15 Serial No 2 No on Hand
>> Model No 1 Product Code 15 Serial No 3 No on Hand
>> Model No 4 Product Code 9 Serial No 12 No on Hand
>> Model No 5 Product Code 27 Serial No 6 No on Hand
>> Model No 5 Product Code 27 Serial No 14 No on Hand
>>
>> I can import the data procedurely using plpgsql to match the individual
>> rows to the master for each row (import the raw data into a buffer table
>> in PostgreSQL and then looping through the rows in the buffer table and
>> checking to see when the Model No changes).

>Note that if you're doing that you better be putting the rownumber from
>excel into the table... result sets are NOT guaranteed to be in insert
>order!

Good point.

>> I'm wondering if there is a more elegant way to do this using straight
>> sql from Excel?

>Well, that's really an excel question, not a Postgres question...

>If you load the whole spreadsheet into a single table and have a way to
>differentiate between the different rows then you might be able to do
>something wih CTE's to relate a serial number to the product code. That
>might be faster than plpgsql.

I have gone the route of loading the whole file into a buffer table in
PostgreSQL and processing it from there. Since the only way I can relate
each row to the model number is by its relative row position, I have used a
plpgsql loop to "flatten" the data.

>You might also be able to do something creative with formulas in excel to
>copy the product code data to the serial # rows. You could then import the
>whole thing and re-normalize it.

I looked at this but decided to just do it in PostgreSQL - much more
powerful :-)

>There's probably some stuff you could do with VBA too. If you care about
>performance you don't want to execute SQL statements for each spreadsheet
>row.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message George Weaver 2015-02-03 14:18:03 Re: "Ungroup" data for import into PostgreSQL
Previous Message Ramesh T 2015-02-03 13:16:35 dbmsscheduler