Re: copy vs. C function

From: rajmhn <rajmhn(dot)ram(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: copy vs. C function
Date: 2016-12-30 11:35:52
Message-ID: 1483097752044-5936796.post@n3.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Jon

This is exactly, what I was looking for. Need to read the data from
delimited file with no header, and do few transformation as described below
using Postgres C function and load it using pg_bulkload utility.

Transformation below, can be handled with query after loading all the data
as varchar and nullable. But we need to handle this before loading as like
we do in Oracle. I'm converting the code from Oracle to Postgres. Both
version of code(Oracle & Postgres) will be available for different users.

In Oracle, doing these kind of transformation in SQL loader. Need to follow
the same kind of approach in Postgres. SQL filter approach was very easy in
terms of coding. From documentation found, C filter was very much faster
than SQL.

I'm very much new to C. Looking for your options as you mentioned here in
post.Some standard syntax for writing these functionalities would be greatly
helpful. Kindly help me.

Sample Data:

ABC|20170101|DEF ||GHIJ|KLM

Target Table Definition:

COLA numeric(5,0)

COLB date

COLC text

COLD text

COLE text

First column should be mapped to COLA

Second column should be mapped to COLB

Third column should be mapped to COLD

Fourth column should be mapped to COLC

Fifth column should be mapped to Some default value(column is not
present in source)

Transformation:

a)First column should be mapped to COLA. It is numeric in target table.
If any alpha-characters were present, default this column with '0'.
Otherwise, source value should be moved to table.

b)Second column should be mapped to COLB. TO_DATE function from text
format. File will have date format as YYYYMMDD. It should be converted to
date.

c)Third column should be mapped to COLD.Need to Trim both leading and
trailing spaces.

d)Fourth column should be mapped to COLC. If it NULL, some value should
be defaulted.

e)Only few columns from source file should be loaded. In this case, only
first four columns should be loaded.

f)Different ordering in source files & target columns.In this case,

Third column should be mapped to COLD

Fourth column should be mapped to COLC

g)COLE should be loaded with default value. This column is not present
in source file.

Thanks

--
View this message in context: http://postgresql.nabble.com/copy-vs-C-function-tp5065298p5936796.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Андрей Хозов 2017-01-02 14:34:49 Performance issue with castings args of the function
Previous Message Daniel Blanch Bataller 2016-12-28 13:11:33 Re: Slow query after 9.3 to 9.6 migration