| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Marcio Ribeiro <mribeiro(at)gmail(dot)com> |
| Cc: | pgsql-performance(at)postgresql(dot)org |
| Subject: | Re: LIMIT 1 poor query plan |
| Date: | 2015-10-10 15:45:36 |
| Message-ID: | 48410.1444491936@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-performance |
Marcio Ribeiro <mribeiro(at)gmail(dot)com> writes:
> Short question:
> Why would pg optimizer choose a worst (slower) query plan for a
> query with 'LIMIT 1' instead of, say, 'LIMIT 3'?
> Complete scenario:
> Query: 'SELECT * FROM a WHERE a.b_id = 42 ORDER BY created LIMIT 1'
> - b_id is a FK to b;
> - created is a datetime with the time of the creation of the row;
> - both 'b' and 'created' are indexed separately
> This query, with the LIMIT 1, uses the index on created, which is much
> slower (10x) than if it used the index on b_id
It's trying to avoid a sort; or to be less anthropomorphic, the estimated
cost of scanning the "created" index until it hits the first row with
b_id=42 is less than the estimated cost of collecting all the rows with
b_id=42 and then sorting them by "created". The estimates unfortunately
are kind of shaky because it's hard to predict how many rows will get
skipped before finding one with b_id=42.
If you do this type of query often enough to care about its performance,
you could consider creating a two-column index on (b_id, created)
(in that order).
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Marcio Ribeiro | 2015-10-10 17:45:20 | Re: LIMIT 1 poor query plan |
| Previous Message | Marcio Ribeiro | 2015-10-10 08:52:35 | LIMIT 1 poor query plan |