Quick Links

Re: Can the query planner create indexes?

From:	tv(at)fuzzy(dot)cz
To:	"Jeremy Harris" <jgh(at)wizmail(dot)org>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Can the query planner create indexes?
Date:	2010-12-21 14:26:29
Message-ID:	383aecfc3802d0da989aac068619387e.squirrel@sq.gransy.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

> On 2010-12-21 10:42, Massa, Harald Armin wrote:
>> b) creating an index requires to read the data-to-be-indexed. So, to
>> have an
>> index pointing at the interesting rows for your query, the table has to
>> be
>> read ... which would be the perfect time to allready select the
>> interesting
>> rows. And after having the interesting rows: the index is worthless
>
> ... until another similar query comes along, when suddenly it's a massive
> win.
> Why not auto-create indices for some limited period after database load
> (copy? any large number of inserts from a single connection?), track
> those
> that actually get re-used and remove the rest? Would this not provide
> a better out-of-the-box experience for neophytes?

Say you have a table with several columns (A,B,C), and the query is using
some of them. What indexes would you create? One index on every column? A
multi-column index on all columns? Indexs for each combination of columns?

There really is no automatic way to solve this puzzle using a single
query. Indexing strategy is a very tough design discipline, and it
requires a complex knowledge of the workload. One slow query does not mean
the index should be created - what if that was just an ad-hoc query and
will not be executed ever again?

So although this 'automatic index creation' seems nice, it really does not
work in practice.

I really don't want to work with products that try to be smarter than me
(and I admit I'm pretty dumb from time to time) and act rather randomly
due to this 'artificial intelligence'. I've already been there and I don't
want to repeat this experience.

>
> [...]
>> Why is the query planner not allowed to create indexes, but only allowed
>> to
>>> use or not use what's available?
>>>
>>
>> as in b): Creating an index is quite expensiv
>
> How much more so than doing that full-table-scan plus sort, which your
> query is doing anyway?

A lot. It consumes a lot of CPU, it may consume a lot of memory (up to
maintenance_work_mem) etc. So imagine a few users, building indices on a
big table simultaneously - no, I really don't want to see this on a
production server.

Building an index is just one side of the problem - maintenance of the
indexes is another thing. Each index has an impact on write operations
(INSERT/UPDATE) and may cause that HOT actually does not work (thus
causing unnecessary bloat of the index). And the indexes may need lot of
space on a disk.

But the real show stopper is probably locking. Building an index takes a
write lock on a table, effectively blocking writes. Sure, you can use a
'CREATE INDEX ... CONCURRENTLY' introduced in 8.2, but there are some
disadvantages of that (see
http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY)

cheers
Tomas

In response to

Re: Can the query planner create indexes? at 2010-12-21 13:44:45 from Jeremy Harris

Responses

Re: Can the query planner create indexes? at 2010-12-21 15:13:29 from Grzegorz Jaśkiewicz
Re: Can the query planner create indexes? at 2010-12-21 15:34:38 from Jeremy Harris

Browse pgsql-general by date

	From	Date	Subject
Next Message	Kenneth Buckler	2010-12-21 15:05:17	Re: PostgreSQL Trusted Startup
Previous Message	Geoff Bowers	2010-12-21 14:14:35	Trouble uninstalling old postgresql installs on osx