From: | Mary Anderson <maryfran(at)demog(dot)berkeley(dot)edu> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org |
Subject: | am i creating a performance bottleneck? |
Date: | 2007-07-18 00:31:11 |
Message-ID: | 469D5F4F.10307@demog.berkeley.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Hi all,
In order to save a fair amount of space I am creating a schema
against which I will want to run the following simplified query. Think
of the size of the tables as data ~ 30 M rows, series has 10,00
entries, data_has_dimensions has 300 K rows, dimensions is tiny
select population = da.value,
residence = di.value
from data da, data_has_dimension dhd, dimensions di, series si
where da.series_id = si.series_id
and si.series = 'my-series'
and dhd.data_id = da.data_id
and dhd.dimension_id = di.dimension_id
and di.dimension = 'residence'
UNION
select population = da.value,
residence = 'total'
where da.series_id = si.series_id
and si.series = 'my_series'
and NOT EXISTS(select dhd.da_id
from data_has_dimensions dhd, dimensions di
where di.dimension = 'residence'
and dhd.dimension_id = di.dimension_id
and dhd.da_id = da.da_id)
I am most worried about the second select, with its 'NOT EXISTS'
statement slowing everything down. I would put appropriate indexes on
this -- namely an index on series for data and an index on dhd for
da_id. My user community is a bunch of academics, so I am not under
the performance constraints I would have for a business application.
Would it help performance if I denormalized the database by attaching
series to the data_has_dimensions table?
Thanks,
Mary Anderson
From | Date | Subject | |
---|---|---|---|
Next Message | plabrh1 | 2007-07-18 00:43:33 | Re: Several postgres installation on windows possible? |
Previous Message | Guillaume Lelarge | 2007-07-17 23:01:40 | Re: plpgsql debugger |