From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Don Seiler <don(at)seiler(dot)us>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Odd Row Estimates in Query Plan (rows=75)
Date: 2018-08-15 19:39:45
Message-ID: f7751be2-57f6-e75e-fbd3-8f38e51c8fbc@aklaver.com
Lists: pgsql-general
On 08/15/2018 12:31 PM, Don Seiler wrote:
> PostgreSQL 9.6.6 on CentOS.
>
> We have a report query that has gone from taking a few seconds to run
> to a few minutes since mid-July. Looking at the output of EXPLAIN
> ANALYZE, the row count estimates are way off, even though this table
> was just analyzed a day or so ago. What's more bizarre to me is that
> the row count estimate is *always* 75 for every node of the plan,
> where the actual row count is in the hundreds or thousands. This table
> is one of the busiest tables in our production database (many inserts
> and updates). It is autovacuumed and autoanalyzed a few times per
> week, although I'm looking to change that to a nightly manual schedule
> to avoid daytime autovacuums.
>
> Hash Join  (cost=1869142.34..1869146.15 rows=75 width=88) (actual time=179877.869..179878.011 rows=759 loops=1)
>   Hash Cond: (stores.pkey = lt.store_pkey)
>   Buffers: shared hit=1654593 read=331897 dirtied=249
>   ->  Seq Scan on stores  (cost=0.00..2.77 rows=77 width=22) (actual time=0.007..0.023 rows=78 loops=1)
>         Buffers: shared hit=2
>   ->  Hash  (cost=1869141.40..1869141.40 rows=75 width=50) (actual time=179877.847..179877.847 rows=759 loops=1)
>         Buckets: 1024  Batches: 1  Memory Usage: 73kB
>         Buffers: shared hit=1654591 read=331897 dirtied=249
>         ->  Subquery Scan on lt  (cost=1869138.59..1869141.40 rows=75 width=50) (actual time=179875.976..179877.697 rows=759 loops=1)
>               Buffers: shared hit=1654591 read=331897 dirtied=249
>               ->  GroupAggregate  (cost=1869138.59..1869140.65 rows=75 width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
>                     Group Key: lts.store_pkey, lts.owner, (date_trunc('minute'::text, lts.date_gifted))
>                     Filter: (count(*) IS NOT NULL)
>                     Buffers: shared hit=1654591 read=331897 dirtied=249
>                     ->  Sort  (cost=1869138.59..1869138.78 rows=75 width=42) (actual time=179875.961..179876.470 rows=6731 loops=1)
>                           Sort Key: lts.store_pkey, lts.entry_source_owner, (date_trunc('minute'::text, lts.date_gifted))
>                           Sort Method: quicksort  Memory: 757kB
>                           Buffers: shared hit=1654591 read=331897 dirtied=249
>                           ->  Index Scan using gifts_date_added on gifts lts  (cost=0.56..1869136.25 rows=75 width=42) (actual time=190.657..179870.165 rows=6731 loops=1)
>                                 Index Cond: ((date_added > '2018-07-14 11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13 14:14:21'::timestamp without time zone))
>                                 Filter: ((date_gifted >= '2018-08-13 11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13 14:14:21'::timestamp without time zone))
>                                 Rows Removed by Filter: 938197
>                                 Buffers: shared hit=1654591 read=331897 dirtied=249
> Planning time: 0.426 ms
> Execution time: 179893.894 ms
>
> I don't have a version of this query from prior to this summer, but
> getting the explain plan for older data from older sandboxes shows a
> similar plan.
I don't have an answer, just a question:
Can you provide the actual query and the table schema?
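
In the meantime, it may be worth looking at what the planner has
recorded for the two timestamp columns. A sketch, using the table and
column names from the plan above:

    -- Per-column statistics the planner uses for selectivity estimates
    SELECT attname, n_distinct, null_frac, histogram_bounds
    FROM pg_stats
    WHERE tablename = 'gifts'
      AND attname IN ('date_added', 'date_gifted');

If the histogram looks coarse or out of date, raising the per-column
statistics target and re-analyzing sometimes tightens estimates like
this one (the target value below is illustrative):

    ALTER TABLE gifts ALTER COLUMN date_gifted SET STATISTICS 500;
    ANALYZE gifts;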
>
> Sidenote: I am suggesting that an index be added on the date_gifted
> field, as it is far more selective and avoids throwing rows away.
> However, I'm very interested in why every node dealing with the gifts
> table thinks rows=75 when the actual count is much, much higher. And
> 75 seems like too round a number to be random?
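
For what it's worth, that suggestion would look something like the
following (the index name is illustrative; CONCURRENTLY avoids blocking
writes on a busy table):

    CREATE INDEX CONCURRENTLY gifts_date_gifted_idx
        ON gifts (date_gifted);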
>
> --
> Don Seiler
> www.seiler.us
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com