Quick Links

Re: Slow Inserts on 1 table?

From:	"John D(dot) Burger" <john(at)mitre(dot)org>
To:	Dan Armbrust <daniel(dot)armbrust(dot)list(at)gmail(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Slow Inserts on 1 table?
Date:	2005-08-02 14:32:08
Message-ID:	77cc8ed5b6d86115fed50515c82055f2@mitre.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

> my guess is because analyze has not been run yet, so it thinks all of
> the tables are size 0. If I let it run for a while, then kill the
> load process, run Analyze, empty the tables, and then restart, things
> perform fine. But that is kind of a ridiculous sequence to have to
> use to load a database.

So automate it. After discovering exactly this behavior, I've
developed an idiom for load scripts where I (optionally) commit at some
linear interval, and (optionally) analyze at some exponential interval.
I presume this has been invented countless times, but here's my basic
idea in pseudo-code:

commitInterval = 1000
analyzeFactor = 2
whenToCommit = whenToAnalyze = commitInterval
nInserts = 0

loop over input data
if we decide to insert
insert
nInserts++
if whenToCommit < nInserts
commmit
whenToCommit += commitInterval
if whenToAnalyze < nInserts
analyze
whenToAnalyze *= 2
....

So (with these constants) we commit after 1000 total inserts, then
after 2000, 3000, etc. And we analyze after 1000 inserts, then after
2000, 4000, etc. This is perhaps way too conservative - in particular,
I suspect that it's only the first one or two analyzes that matter -
but it works for me.

The only annoyance is that the interface I use most often, Python's
pgdb, runs everything in a transaction, and you can't analyze in a
transaction. I've gotten around this in a variety of ways, some less
principled than others.

- John D. Burger
MITRE

In response to

Re: Slow Inserts on 1 table? at 2005-08-02 13:38:25 from Dan Armbrust

Responses

Re: Slow Inserts on 1 table? at 2005-08-02 16:53:26 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Victor	2005-08-02 14:32:37	Cost problem
Previous Message	Richard Huxton	2005-08-02 14:18:48	Re: Slow Inserts on 1 table?