From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | yudhi s <learnerdatabase99(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Moving delta data faster |
Date: | 2024-04-04 04:46:56 |
Message-ID: | b43139d3-8f8e-4635-a7cb-cab90e5205eb@aklaver.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 4/3/24 20:54, yudhi s wrote:
> On Thu, Apr 4, 2024 at 2:41 AM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
> <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>
> On 4/3/24 13:38, yudhi s wrote:
> > Hi All,
> > It's postgresql database version 15.4. We have a requirement in
> which
> > we will be initially moving full table data for 3-4 tables, from
> source
> > database to target(i.e. postgres) . Maximum number of rows will be
> > ~10million rows in those tables. Then subsequently these rows
> will be
> > inserted/updated based on the delta number of rows that got
> > inserted/updated in the source database. In some cases these changed
> > data can flow multiple times per day to the downstream i.e. postgres
> > database and in other cases once daily.
>
> What is the source database?
>
> Can it be reached with a FDW?:
>
> https://wiki.postgresql.org/wiki/Foreign_data_wrappers
> <https://wiki.postgresql.org/wiki/Foreign_data_wrappers>
>
> Can the delta on the source be output as CSV?
>
>
>
> Thank you Adrian.
>
> And one thing i forgot to mention this target postgresql database would
> be on AWS RDS whereas the source Oracle databases is on premise. I think
> we don't have the FDW extension currently in place but we can get that.
> I am just not able to understand clearly though, but do you mean export
> the data from source using CSV and do truncate and import on target. And
> as these data will be moved through the network won't that cause slowness?
>
> The source database here is Oracle database. Correct me if wrong, it
> looks like foreign data wrapper is like a DB link. Or do you mean
> writing a query on the target database (which can be UPSERT or MERGE)
> but will be joining the table from the source database through the
> DBlink/DDW? But my question was whether we should use UPSERT or MERGE
> for comparing and loading the delta records to the target postgresql
> database. Want to understand which is more performant , as I see in the
> past Merge having performance issues in the past, but not very sure
> about that.
My motivation was to get some basic information about your setup and
what you are trying to achieve.
If I understand correctly you have:
1) An Oracle database with tables that you want to copy the complete
data from to a Postgres database. For this sort of thing
COPY(https://www.postgresql.org/docs/current/sql-copy.html) on the
Postgres end using CSV data generated from the source is probably the
quickest bulk load method.
2) After the initial load you want to do follow up INSERT/UPDATEs based
on a delta of the source tables relative to the initial load. This is
still a bit of mystery to me. How are determining the delta: a) On the
source end entirely or b) Target relative to source? Also what is the
anticipated size of the delta per transfer?
Additional information needed:
1) Network distance between source and target?
2) Network capacity?
3) Expected load on both source and target servers from other operations?
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
From | Date | Subject | |
---|---|---|---|
Next Message | Adnan Dautovic | 2024-04-04 05:23:34 | Failure of postgres_fdw because of TimeZone setting |
Previous Message | yudhi s | 2024-04-04 03:54:05 | Re: Moving delta data faster |