Re: Tweaking PG (again)

From: "Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>, "PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Tweaking PG (again)
Date: 2008-11-14 04:57:32
Message-ID: e373d31e0811132057s7ed83524kd043eff1cbb854da@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Nov 14, 2008 at 10:57 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Yeah, but note that the planner knows darn well that this will be an
> expensive query --- 493427.14 cost units estimated to fetch 2 rows!
>
> My interpretation is that the condition on user_id is horribly
> nonselective (at least for this value of user_id) and the planner knows
> it. The condition on url_encrypted *is* selective, and the planner
> knows that too, but there's nothing it can do about it --- the best
> available plan is to fetch all the rows matching by user_id and then
> filter them on url_encrypted.
>
> Consider creating an index on url_encrypted if you need this type of
> query to go fast.

Thanks Tom.

Yes, I have considered indexing url_encrypted too. That would be a
very large index though, space-wise, but may increase the speed. You
are right that only "user_id" is not too selective.

The two together (user_id, url_encrypted) should be unique in my case.
So I can now think of making a unique index with these two fields.

Questions:

1. If I have a unique index on (user_id, url_encrypted), then will
queries asking only for user_id also use this index? Or should i
simply have separate indexes on user_id and url_encrypted? I vaguely
recall reading somewhere that compound indexes may have been useful in
MySQL but according to PG's more advanced planner having two separate
indexes on the columns works better.

2. Is there a production equivalent of REINDEX? Last time I tried
CREATE INDEX CONCURRENTLY overnight, by the morning it had croaked
with these errors:

---
ERROR: deadlock detected
DETAIL: Process 6663 waits for ShareLock on transaction 999189656;
blocked by process 31768.
Process 31768 waits for ShareUpdateExclusiveLock on relation 50002 of
database 41249; blocked by process 6663
---

Naturally, when I see the table now, this attempted index is marked
"INVALID". The manual says I should either drop it and recreate it, or
REINDEX it again. But this is a production database on a semi-busy
website and cannot take time off. What is my recourse for a kind of
live REINDEX? Can I create a new index without locking the database?

3. Basically, design wise, I use url_encrypted to check if a user_id
already has a url associated with him. This kind of a unique
constraint check (user_id, url_encrypted). Used only when INSERTing a
new record -- if the user has it already, then simply update values if
needed and return the current row. Otherwise, INSERT new row. I do
this check+update+insert with three SQLs. Is there one way of doing it
in SQL in PG?

Many thanks for all the help thus far!

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adam Seering 2008-11-14 05:31:25 Seek within Large Object, within PL/* function?
Previous Message Scott Marlowe 2008-11-14 04:02:23 Re: backup and permissions