Re: query performance question

From: tv(at)fuzzy(dot)cz
To: "Marcin Citowicki" <marcin(dot)citowicki(at)m4n(dot)nl>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query performance question
Date: 2008-06-03 09:04:34
Message-ID: 31028.217.77.161.17.1212483874.squirrel@sq.gransy.com
Lists: pgsql-performance

Hi,

Hubert already answered your question - it's expected behavior: count(*) has
to read all the tuples from the table (even dead ones!). So if you have a
really huge table, it will take a long time to read it.

There are several ways to speed it up - some are simple (but the speedup is
limited), others require changing the application logic and rewriting part
of the application (using triggers to count the rows, etc.).

1) If the transactions have a sequential ID without gaps, you may simply
select MAX(id) and that gives you the count. This won't work if some of the
transactions were deleted or if you need to use other filtering criteria.
The needed change in the application is quite small (basically just a
single SQL query).
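
Something like this (using the transid column from your query) should be
enough, and with the index on transid it should return almost instantly:

    SELECT MAX(transid) AS row_count FROM transactions;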

2) Move the table to a separate tablespace (a separate disk if possible).
This will speed up the reads, as the table will be 'compact'. This is just
a db change, it does not require any change to the application logic. It
will give you some speedup, but not as much as 1) or 3).
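
A rough sketch (the tablespace name and path are just examples; note that
moving the table locks it while the data files are copied):

    CREATE TABLESPACE transactions_space LOCATION '/mnt/disk2/pgdata';
    ALTER TABLE transactions SET TABLESPACE transactions_space;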

3) Build a table with totals (or maybe subtotals), updated by triggers.
This requires more serious changes to both the application and the
database, but it avoids the limitations of 1) and may give you even better
results.
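
A rough sketch of the trigger approach (all the names are made up, and only
the grand total is maintained - subtotals would need extra columns and some
grouping logic):

    -- one-row table holding the current count, initialized once
    CREATE TABLE transactions_total (n bigint NOT NULL);
    INSERT INTO transactions_total SELECT count(*) FROM transactions;

    CREATE FUNCTION transactions_total_maint() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            UPDATE transactions_total SET n = n + 1;
            RETURN NEW;
        ELSE  -- DELETE
            UPDATE transactions_total SET n = n - 1;
            RETURN OLD;
        END IF;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER transactions_total_trig
        AFTER INSERT OR DELETE ON transactions
        FOR EACH ROW EXECUTE PROCEDURE transactions_total_maint();

The application then reads the count with a cheap 'SELECT n FROM
transactions_total' instead of count(*). Keep in mind that updating a
single counter row serializes concurrent inserts/deletes, so you trade some
write concurrency for a fast count.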

Tomas

> Hello,
>
> I have a table (transactions) containing 61 414 503 rows. The basic
> count query (select count(transid) from transactions) takes 138226
> milliseconds.
> This is the query analysis output:
>
> Aggregate (cost=2523970.79..2523970.80 rows=1 width=8) (actual
> time=268964.088..268964.090 rows=1 loops=1);
> -> Seq Scan on transactions (cost=0.00..2370433.43 rows=61414943
> width=8) (actual time=13.886..151776.860 rows=61414503 loops=1);
> Total runtime: 268973.248 ms;
>
> Query has several indexes defined, including one on transid column:
>
> non-unique;index-qualifier;index-name;type;ordinal-position;column-name;asc-or-desc;cardinality;pages;filter-condition
>
> f;<null>;transactions_id_key;3;1;transid;<null>;61414488;168877;<null>;
> t;<null>;trans_ip_address_index;3;1;ip_address;<null>;61414488;168598;<null>;
> t;<null>;trans_member_id_index;3;1;member_id;<null>;61414488;169058;<null>;
> t;<null>;trans_payment_id_index;3;1;payment_id;<null>;61414488;168998;<null>;
> t;<null>;trans_status_index;3;1;status;<null>;61414488;169005;<null>;
> t;<null>;transactions__time_idx;3;1;time;<null>;61414488;168877;<null>;
> t;<null>;transactions_offer_id_idx;3;1;offer_id;<null>;61414488;169017;<null>;
>
> I'm not a dba so I'm not sure if the time it takes to execute this query
> is OK or not, it just seems a bit long to me.
> I'd appreciate it if someone could share his/her thoughts on this. Is
> there a way to make this table/query perform better?
> Any query I'm running that joins with transactions table takes forever
> to complete, but maybe this is normal for a table this size.
> Regards,
>
> Marcin
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
