From: | Shaun Thomas <sthomas(at)optionshouse(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Setting Statistics on Functional Indexes |
Date: | 2012-10-24 19:31:11 |
Message-ID: | 508841FF.9040303@optionshouse.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 10/24/2012 02:11 PM, Tom Lane wrote:
> It's not particularly (not that you've even defined what you think
> "optimistic" is, much less mentioned what baseline you're comparing
> to).
The main flaw with my example is that it's random. But I swear I'm not
making it up! :)
There seems to be a particularly nasty edge case we're triggering, then.
Like I said, it's worse when col1+col2 don't match anything. In that
case, it's using the trunc index on the date column, which has
demonstrably worse performance. Here are the two analyzes I got
before/after front-loading statistics.
Before stats increase:
Sort (cost=9.38..9.39 rows=1 width=23) (actual time=78.282..78.282
rows=0 loops=1)
Sort Key: id, action_date
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_date_test_action_date_trunc on date_test
(cost=0.00..9.37 rows=1 width=23) (actual time=78.274..78.274 rows=0
loops=1)
Index Cond: (date_trunc('day'::text, action_date) >=
'2012-10-24 00:00:00'::timestamp without time zone)
Filter: (((col1)::text = 'S:96'::text) AND (col2 = 657::numeric))
Total runtime: 78.317 ms
And then after. I used your unofficial trick to set it to 1000:
alter index idx_date_test_action_date_trunc
alter column date_trunc set statistics 1000;
analyze date_test;
Sort (cost=9.83..9.83 rows=1 width=23) (actual time=0.038..0.038
rows=0 loops=1)
Sort Key: id, action_date
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_date_test_col1_col2 on date_test
(cost=0.00..9.82 rows=1 width=23) (actual time=0.030..0.030 rows=0 loops=1)
Index Cond: (((col1)::text = 'S:96'::text) AND (col2 =
657::numeric))
Filter: (date_trunc('day'::text, action_date) >= '2012-10-24
00:00:00'::timestamp without time zone)
Total runtime: 0.066 ms
This is on a bone-stock PG 9.1.6 from Ubuntu 12.04 LTS, with
default_statistics increased to 500. The only thing I bumped up was the
functional index between those two query plans.
But then I noticed something else. I reverted back to the old 500
default for everything, and added an index:
create index idx_date_test_action_date_trunc_col1
on date_test (date_trunc('day', action_date), col1);
I think we can agree that this index would be more selective than the
one on date_trunc by itself. Yet:
Sort (cost=9.38..9.39 rows=1 width=23) (actual time=77.055..77.055
rows=0 loops=1)
Sort Key: id, action_date
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_date_test_action_date_trunc on date_test
(cost=0.00..9.37 rows=1 width=23) (actual time=77.046..77.046 rows=0
loops=1)
Index Cond: (date_trunc('day'::text, action_date) >=
'2012-10-24 00:00:00'::timestamp without time zone)
Filter: (((col1)::text = 'S:96'::text) AND (col2 = 657::numeric))
Total runtime: 77.091 ms
All I have to say about that is: wat.
--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas(at)optionshouse(dot)com
______________________________________________
See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
From | Date | Subject | |
---|---|---|---|
Next Message | Shaun Thomas | 2012-10-24 19:54:52 | Re: Setting Statistics on Functional Indexes |
Previous Message | Tom Lane | 2012-10-24 19:11:48 | Re: Setting Statistics on Functional Indexes |