Quick Links

Tuning queries on large database

From:	Valerie Schneider DSI/DEV <Valerie(dot)Schneider(at)meteo(dot)fr>
To:	pgsql-performance(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Cc:	Valerie(dot)Schneider(at)meteo(dot)fr
Subject:	Tuning queries on large database
Date:	2004-08-04 12:44:43
Message-ID:	200408041244.i74CihO19344@mu.meteo.fr
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general pgsql-performance

Hi,

I have some problem of performance on a PG database, and I don't
know how to improve. I Have two questions : one about the storage
of data, one about tuning queries. If possible !

My job is to compare Oracle and Postgres. All our operational databases
have been running under Oracle for about fifteen years. Now I try to replace
Oracle by Postgres.

I have a test platform under linux (Dell server, 4 Gb RAM, bi-processor,
Linux Red Hat 9 (2.4.20-31.9)) with 2 databases, 1 with Oracle
(V8i or V9i it's quite the same), 1 with PG (7.4.2). Both databases
have the same structure, same content, about 100 Gb each. I developped
some benches, representative of our use of databases. My problem
is that I have tables (relations) with more than 100 millions rows,
and each row has about 160 fields and an average size 256 bytes.

For Oracle I have a SGA size of 500 Mb.
For PG I have a postgresql.conf as :
max_connections = 1500
shared_buffers = 30000
sort_mem = 50000
effective_cache_size = 200000
and default value for other parameters.

It contains 1000 different values of "num_poste" and for each one
125000 different values of "dat" (1 row per hour, 15 years).

I run a vacuum analyze of the table.

My first remark is that the table takes a lot of place on disk, about
70 Gb, instead of 35 Gb with oracle.
125 000 000 rows x 256 b = about 32 Gb. This calculation gives an idea
not so bad for oracle. What about for PG ? How data is stored ?

The different queries of the bench are "simple" queries (no join,
sub-query, ...) and are using indexes (I "explained" each one to
be sure) :
Q1 select_court : access to about 700 rows : 1 "num_poste" and 1 month
(using PK : num_poste=p1 and dat between p2 and p3)
Q2 select_moy : access to about 7000 rows : 10 "num_poste" and 1 month
(using PK : num_poste between p1 and p1+10 and dat between p2 and p3)
Q3 select_long : about 250 000 rows : 2 "num_poste"
(using PK : num_poste in (p1,p1+2))
Q4 select_tres_long : about 3 millions rows : 25 "num_poste"
(using PK : num_poste between p1 and p1 + 25)

The result is that for "short queries" (Q1 and Q2) it runs in a few
seconds on both Oracle and PG. The difference becomes important with
Q3 : 8 seconds with oracle
80 sec with PG
and too much with Q4 : 28s with oracle
17m20s with PG !

Of course when I run 100 or 1000 parallel queries such as Q3 or Q4,
it becomes a disaster !
I can't understand these results. The way to execute queries is the
same I think. I've read recommended articles on the PG site.
I tried with a table containing 30 millions rows, results are similar.

What can I do ?

Thanks for your help !

********************************************************************
* Les points de vue exprimes sont strictement personnels et *
* n'engagent pas la responsabilite de METEO-FRANCE. *
********************************************************************
* Valerie SCHNEIDER Tel : +33 (0)5 61 07 81 91 *
* METEO-FRANCE / DSI/DEV Fax : +33 (0)5 61 07 81 09 *
* 42, avenue G. Coriolis Email : Valerie(dot)Schneider(at)meteo(dot)fr *
* 31057 TOULOUSE Cedex - FRANCE http://www.meteo.fr *
********************************************************************

Responses

Re: [PERFORM] Tuning queries on large database at 2004-08-04 13:21:51 from Christopher Kings-Lynne
Re: Tuning queries on large database at 2004-08-04 13:26:42 from Rod Taylor
Re: Tuning queries on large database at 2004-08-04 15:34:56 from Gaetano Mendola

Browse pgsql-general by date

	From	Date	Subject
Next Message	Merlin Moncure	2004-08-04 13:06:54	Re: [PERFORM] Tuning queries on large database
Previous Message	terry	2004-08-04 12:07:55	Re: trash talk

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Paul Serby	2004-08-04 12:45:55	The black art of postgresql.conf tweaking
Previous Message	Ulrich Wisser	2004-08-04 12:00:39	How to know which queries are to be optimised?