From: Richard Huxton <dev(at)archonet(dot)com>
To: Marc Cousin <mcousin(at)sigma(dot)fr>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very big insert/join performance problem (bacula)
Date: 2009-07-14 08:23:25
Message-ID: 4A5C407D.2070808@archonet.com
Lists: pgsql-performance
Marc Cousin wrote:
>
> Temporarily I moved the problem at a bit higher sizes of batch by changing
> random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel like an
> apprentice sorcerer with this, as I told postgreSQL that fetching rows from
> disk are much cheaper than they are. These values are, I think, completely
> abnormal.
They certainly don't have anything to do with reality. Try putting them
back to (say) seq_page_cost=1 and random_page_cost=2.
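For example, these can be changed per session or in postgresql.conf; a minimal sketch of putting them back near the defaults:

```sql
-- Restore sane planner cost settings for the current session.
-- (Setting them in postgresql.conf and reloading affects all sessions.)
SET seq_page_cost = 1.0;
SET random_page_cost = 2.0;
```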
> So, finally, to my questions :
> - Is it normal that PostgreSQL is this off base on these queries (sorry I
> don't have the plans, if they are required I'll do my best to get some, but
> they really are the two obvious plans for this kind of query). What could
> make it choose the hash join for too small batch tables ?
No point in speculating without plans.
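The plans are easy to capture with EXPLAIN ANALYZE; a sketch, with placeholder table and column names standing in for the real batch/filename tables:

```sql
-- Hypothetical example: show the plan the planner actually chose,
-- with real row counts, for the problem join.
EXPLAIN ANALYZE
SELECT f.filename_id
FROM batch b
JOIN filename f ON f.name = b.name;
```

Comparing the estimated and actual row counts in that output is what tells you where the planner is off base.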
> - Is changing the 2 costs the way to go ?
Not the way you have.
> - Is there a way to tell postgreSQL that it's more costly to sort than it
> thinks ? (instead of telling it that fetching data from disk doesn't cost
> anything).
That's what the cost settings are for. But if you set a couple of them
wildly away from reality, it'll be pure chance if the planner gets any
estimates right.
> Here are the other non-default values from my configuration :
>
> shared_buffers = 2GB
> work_mem = 64MB
Set this *much* higher when you are running your bulk imports. You can
do it per-connection. Try 256MB, 512MB, 1GB (but keep an eye on total
memory used).
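A sketch of doing it per-connection, in the session that runs the import:

```sql
-- Raise work_mem only for this bulk-import session; other
-- connections keep the global 64MB setting.
SET work_mem = '512MB';
-- ... run the big insert/join here ...
RESET work_mem;
```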
> maintenance_work_mem = 256MB
> max_fsm_pages = 15000000 # There are quite big deletes with bacula ...
> effective_cache_size = 800MB
See other emails on this one.
> default_statistics_target = 1000
You probably don't need this for all columns, but it won't cause
problems with these queries.
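If you want finer control, the target can be raised per column instead of globally; a sketch with placeholder names:

```sql
-- Hypothetical: raise statistics only on the join column that needs
-- better estimates, then re-gather stats for that table.
ALTER TABLE filename ALTER COLUMN name SET STATISTICS 1000;
ANALYZE filename;
```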
--
Richard Huxton
Archonet Ltd