Re: COPY and indices?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: François Beausoleil <francois(at)teksol(dot)info>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: COPY and indices?
Date: 2012-03-14 19:06:27
Message-ID: CAHyXU0xw2isGK=WBWpw6mGypLrNe7hAWu3Ma4ce_4tockaSMHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

2012/3/14 François Beausoleil <francois(at)teksol(dot)info>:
> Le mardi 13 mars 2012 à 11:15, Merlin Moncure a écrit :
>
>> 2012/3/13 François Beausoleil <francois(at)teksol(dot)info (mailto:francois(at)teksol(dot)info)>:
>> >
>> > I'll go with the COPY, since I can live with the batched requirements just fine.
>>
>> 30-40 'in transaction' i/o bound inserts is so slow as to not really
>> be believable unless each record is around 1 megabyte because being in
>> transaction removes storage latency from the equation. Even on a
>> crappy VM. As a point of comparison my sata workstation drive can do
>> in the 10s of thousands. How many records are you inserting per
>> transaction?
>>
>
>
> I took the time to gather statistics about the database server: https://gist.github.com/07bbf8a5b05b1c37a7f2
>
> The files are a series of roughly 30 second samples, while the system is under production usage. When I quoted 30-40 transactions per second, I was actually referring to the number of messages processed from my message queue. Going by the PostgreSQL numbers, xact_commit tells me I manage 288 commits per second. It's much better than I anticipated.
>
> Anyways, if anybody has comments on how I could increase throughput, I'd appreciate. My message queues are almost always backed up by 1M messages, and it's at least partially related to PostgreSQL: if the DB can write faster, I can manage my backlog better.
>
> I'm still planning on going with batch processing, but I need to do something ASAP to give me just a bit more throughput.

well your iowait numbers are through the roof which makes things
pretty simple from a diagnosis point of view: your storage is
overloaded. the only remedies are to try and make your queries more
efficient so that you are doing less writing, better use of
transactions, etc. but looking at the log it appears the low hanging
fruit is already grabbed (synchronous_commit=off, etc). so you have
to choose from a list of not very pleasant options:

*) fsync=off
*) tune the application
*) bring more/faster storage online. a single ssd would probably make
your problem disappear. in the vm world, hopefully you can at least
bring another volume online and move your wal to that.
*) HARDWARE.

In the entirety of my career, I have never found anything more
perplexing than the general reluctance to upgrade hardware to solve
hardware related performance bottlenecks. Virtualization is great
technology but is nowhere near good enough in my experience to handle
high transaction rate database severs. A 5000$ server will solve your
issue, and you'll spend that in two days scratching your head trying
to figure out the issue (irritating your customers all the while).

merlin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Steve Crawford 2012-03-14 19:16:46 Re: Upgrade questions
Previous Message John R Pierce 2012-03-14 19:04:56 Re: Upgrade questions