From: | Alexander Staubo <alex(at)bengler(dot)no> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Bad query plan with high-cardinality column |
Date: | 2013-02-22 21:31:48 |
Message-ID: | F8FDDA2A2FCB47E8A13F0411BA37311E@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Friday, February 22, 2013 at 21:33 , Tom Lane wrote:
> The reason is that the LIMIT may stop the query before it's scanned all
> of the index. The planner estimates on the assumption that the desired
> rows are roughly uniformly distributed within the created_at index, and
> on that assumption, it looks like this query will stop fairly soon ...
> but evidently, that's wrong. On the other hand, it knows quite well
> that the other plan will require pulling out 5000-some rows and then
> sorting them before it can return anything, so that's not going to be
> exactly instantaneous either.
>
> In this example, I'll bet that conversation_id and created_at are pretty
> strongly correlated, and that most or all of the rows with that specific
> conversation_id are quite far down the created_at ordering, so that the
> search through the index takes a long time to run. OTOH, with another
> conversation_id the same plan might run almost instantaneously.
That's right. So I created a composite index, and not only does this make the plan correct, but the planner now chooses a much more efficient plan than the previous index that indexed only on "conversation_id":
Limit (cost=0.00..30.80 rows=13 width=12) (actual time=0.042..0.058 rows=13 loops=1)
Buffers: shared hit=8
-> Index Scan using index_comments_on_conversation_id_and_created_at on comments (cost=0.00..14127.83 rows=5964 width=12) (actual time=0.039..0.054 rows=13 loops=1)
Index Cond: (conversation_id = 3975979)
Buffers: shared hit=8
Total runtime: 0.094 ms
Is this because it can get the value of "created_at" from the index, or is it because it can know that the index is pre-sorted, or both?
Very impressed that Postgres can use a multi-column index for this. I just assumed, wrongly, that it couldn't. I will have to go review my other tables now and see if they can benefit from multi-column indexes.
Thanks!
From | Date | Subject | |
---|---|---|---|
Next Message | Alexander Staubo | 2013-02-22 21:34:21 | Re: Bad query plan with high-cardinality column |
Previous Message | Kevin Grittner | 2013-02-22 20:47:56 | Re: Bad query plan with high-cardinality column |