From: | Shaun Thomas <bonesmoses(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Having some problems with concurrent COPY commands |
Date: | 2015-10-12 20:14:26 |
Message-ID: | CAB78C+D=4CyZ-33kLvvZMenQWtkNEsmvkMKRPSV3AG2edchgbA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Mon, Oct 12, 2015 at 1:28 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Any chance
> you could provide profiles of such a run?
This is as simple as I could make it reliably. With one copy running,
the thread finishes in about 1 second. With 2, it's 1.5s each, and
with all 4, it's a little over 3s for each according to the logs. I
have log_min_duration_statement set to 1000, so it's pretty obvious.
The scary part is that it's not even scaling linearly; performance is
actually getting *worse* with each subsequent thread.
Regarding performance, all of this fits in memory. The tables are only
100k rows with the COPY statement. The machine itself is 8 CPUs with
32GB of RAM, so it's not an issue of hardware. So far as I can tell,
it happens on every version I've tested on, from 9.2 to 9.4. I also
take back what I said about wal_level. Setting it to minimal does
nothing. Disabling archive_mode and setting max_wal_senders to 0 also
does nothing. With 4 concurrent processes, each takes 3 seconds, for a
total of 12 seconds to import 400k rows when it would take 4 seconds
to do sequentially. Sketchy.
COPY (
SELECT id, id % 100, id % 1000, now() - (id || 's')::INTERVAL
FROM generate_series(1, 100000) a(id)
) TO '/tmp/loadtest1.csv';
CREATE TABLE test_copy (
id SERIAL PRIMARY KEY,
location VARCHAR NOT NULL,
reading BIGINT NOT NULL,
reading_date TIMESTAMP NOT NULL
);
CREATE INDEX idx_test_copy_location ON test_copy (location);
CREATE INDEX idx_test_copy_date ON test_copy (reading_date);
CREATE TABLE test_copy2 (LIKE test_copy INCLUDING INDEXES);
CREATE TABLE test_copy3 (LIKE test_copy INCLUDING INDEXES);
CREATE TABLE test_copy4 (LIKE test_copy INCLUDING INDEXES);
psql -c "COPY test_copy FROM '/tmp/loadtest1.csv'" &>/dev/null &
psql -c "COPY test_copy2 FROM '/tmp/loadtest1.csv'" &>/dev/null &
psql -c "COPY test_copy3 FROM '/tmp/loadtest1.csv'" &>/dev/null &
psql -c "COPY test_copy4 FROM '/tmp/loadtest1.csv'" &>/dev/null &
--
Shaun Thomas
bonesmoses(at)gmail(dot)com
http://bonesmoses.org/
From | Date | Subject | |
---|---|---|---|
Next Message | Graeme B. Bell | 2015-10-13 08:47:09 | V8 optimisation (if you're using javascript in postgres) |
Previous Message | Jeff Janes | 2015-10-12 18:56:23 | Re: Having some problems with concurrent COPY commands |