From: "Luke Lonergan" <LLonergan(at)greenplum(dot)com>
To: "David Lang" <dlang(at)invendra(dot)net>
Cc: "Steve Oualline" <soualline(at)stbernard(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Database restore speed
Date: 2005-12-02 08:06:43
Message-ID: 3E37B936B592014B978C4415F90D662D01C48F01@MI8NYCMAIL06.Mi8.com
Lists: pgsql-performance
David,
> Luke, would it help to have one machine read the file and
> have it connect to postgres on a different machine when doing
> the copy? (I'm thinking that the first machine may be able to
> do a lot of the parsing and conversion, leaving the second
> machine to just worry about doing the writes)
Unfortunately not - the parsing / conversion core is in the backend,
where it should be, IMO: the attribute conversion has to happen there,
in the machine-native representation of the attributes (int4, float,
etc.), and the backend also has to convert from the client encoding
(like LATIN1) to the backend encoding (like UNICODE, aka UTF8).
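To make that concrete, here is a minimal sketch of the scenario you
describe (table and file names are hypothetical): a second machine reads
the file and ships it over the wire with psql's \copy. The raw bytes move
between machines, but the attribute parsing and encoding conversion still
happen in the backend:

  -- run via psql on the "reader" machine, connected to the remote backend
  SET client_encoding = 'LATIN1';  -- backend converts LATIN1 -> server encoding
  \copy mytable from '/local/path/data.txt'

So the second machine buys you the file I/O and the network shipping, but
none of the CPU-bound parse/convert work moves off the database server.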
There are a few areas under discussion for continued performance
increases in the COPY FROM code; here are my picks:
- More micro-optimization of the parsing and att conversion core - maybe
100% speedup in the parse/convert stage is possible
- A user-selectable option to bypass transaction logging, similar to
Oracle's (a rough sketch of the closest current workaround follows below)
- A well-defined binary input format, like Oracle's SQL*Loader - this
would bypass most parsing / att conversion
- A direct-to-table storage loader facility - this would probably be the
fastest possible load rate
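On the transaction-logging item: there is no user-visible switch today,
but if I am remembering the recent changes right, the backend can already
skip WAL for a COPY into a table created in the same transaction (when
WAL archiving is off). A rough sketch, with a hypothetical table and path:

  BEGIN;
  CREATE TABLE load_target (id int4, val float8);  -- hypothetical target table
  -- COPY into a just-created table can skip WAL: if we crash,
  -- the CREATE rolls back and the table vanishes anyway
  COPY load_target FROM '/path/to/data.txt';
  COMMIT;

The win is not writing every tuple twice (once to WAL, once to the heap),
which is exactly the cost a user-selectable bypass option would eliminate
for existing tables.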
- Luke