From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Shaun Thomas <bonesmoses(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Having some problems with concurrent COPY commands |
Date: | 2015-10-13 09:32:38 |
Message-ID: | 561CCFB6.6040405@iki.fi |
Lists: | pgsql-performance |
On 10/12/2015 11:14 PM, Shaun Thomas wrote:
> On Mon, Oct 12, 2015 at 1:28 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
>> Any chance
>> you could provide profiles of such a run?
>
> This is as simple as I could make it reliably. With one copy running,
> the thread finishes in about 1 second. With 2, it's 1.5s each, and
> with all 4, it's a little over 3s for each according to the logs. I
> have log_min_duration_statement set to 1000, so it's pretty obvious.
> The scary part is that it's not even scaling linearly; performance is
> actually getting *worse* with each subsequent thread.
>
> Regarding performance, all of this fits in memory. The tables are only
> 100k rows with the COPY statement. The machine itself is 8 CPUs with
> 32GB of RAM, so it's not an issue of hardware. So far as I can tell,
> it happens on every version I've tested on, from 9.2 to 9.4. I also
> take back what I said about wal_level. Setting it to minimal does
> nothing. Disabling archive_mode and setting max_wal_senders to 0 also
> does nothing. With 4 concurrent processes, each takes 3 seconds, for a
> total of 12 seconds to import 400k rows when it would take 4 seconds
> to do sequentially. Sketchy.
I was not able to reproduce that behaviour on my laptop. I bumped the
number of rows in your script 100000, to make it run a bit longer.
Attached is the script I used. The total wallclock time the COPYs take
on 9.4 is about 8 seconds for a single COPY, and 12 seconds for 4
concurrent COPYs. So it's not scaling as well as you might hope, but
it's certainly not worse-than-serial either, as you're seeing.
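For reference, those two timings can be turned into a speedup figure. This is only a minimal sketch in Python, assuming each of the 4 concurrent COPYs loads the same amount of data as the single COPY. The function name `concurrent_speedup` is illustrative and is not part of the attached script.

```python
def concurrent_speedup(single_secs, concurrent_secs, n_jobs):
    """Return (speedup, efficiency) of running n_jobs concurrently.

    Assumes every job does the same work as the single timed job, so
    running them one after another would take n_jobs * single_secs.
    """
    serial_secs = n_jobs * single_secs
    speedup = serial_secs / concurrent_secs
    efficiency = speedup / n_jobs
    return speedup, efficiency


# Timings quoted above: ~8 s for one COPY, ~12 s for 4 concurrent COPYs.
speedup, efficiency = concurrent_speedup(8.0, 12.0, 4)
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
# prints "speedup 2.67x, efficiency 67%"
```

Anything above 1x beats running the COPYs serially; 4x would be perfect scaling for 4 jobs.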
If you're seeing this on 9.2 and 9.4 alike, this can't be related to the
XLogInsert scaling patch, although you might've found a case where that
patch didn't help where it should've. I ran "perf" to profile the test
case, and it looks like about 80% of the CPU time is spent in the b-tree
comparison function. That doesn't leave much scope for XLogInsert
scalability to matter one way or another.
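To put a rough number on that: if about 80% of the CPU time is in the comparison function, everything else, XLogInsert included, shares the remaining 20%. The sketch below applies the usual Amdahl-style bound. It assumes the perf percentages translate directly into wallclock time, which is a simplification, and the helper name is hypothetical.

```python
def max_speedup_if_removed(fraction):
    """Upper bound on speedup if work taking `fraction` of runtime became free."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return 1.0 / (1.0 - fraction)


# At most 20% of the time lies outside the b-tree comparisons, so even if
# all of it were XLogInsert and that cost dropped to zero, the bound is 1.25x.
print(f"{max_speedup_if_removed(0.20):.2f}x")
# prints "1.25x"
```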
I have no explanation for what you're seeing though. A bad spinlock
implementation perhaps? Anything special about the hardware at all? Can
you profile it on your system? Which collation?
- Heikki
Attachment | Content-Type | Size |
---|---|---|
launch4.sh | application/x-sh | 1002 bytes |