Suggestion for optimization

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Subject: Suggestion for optimization
Date: 2002-04-05 19:04:13
Message-ID: D90A5A6C612A39408103E6ECDD77B82906F424@voyager.corporate.connx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

It would be nice if total table cardinality could be maintained live.
So (after the initial vacuum) we update the cardinality for each table
in the system table (or perhaps add an entry to the table itself).
There are two reasons why this is an important optimization. Firstly,
it is a psychological benefit for both benchmarks and customers when
doing a select count(*) from <tablename>. This is something that pops
up all the time in benchmarks and customers do it too, in order to get a
feel for speed. By storing the current number and incrementing for
every insert and decrementing for every delete, the count(*) case with
no where clause can return the value instantly.

The far more important reason is for optimizations. An accurate
cardinality figure can greatly enhance the optimizer's ability to
perform joins in the correct order.

An example of a SQL system that does this sort of thing is Microsoft
SQL*Server. If you have 100 million rows in a table and do:
SELECT COUNT(*) FROM table_name
it will return the correct number instantly. The same is true for
Oracle.

It might also be possible to keep an array in memory of:
typedef struct tag_cardinality_list {
char *table_name;
unsigned long cardinality;
} cardinality_list;

and keep the data updated there with simple interlocked exchange
operations. The list would be loaded on Postmaster startup and saved on
shutdown.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2002-04-05 19:13:26 Re: timeout implementation issues
Previous Message Jan Wieck 2002-04-05 18:53:56 Re: timeout implementation issues