From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Wheeler <dwheeler(at)dgitsystems(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Queue table that quickly grows causes query planner to choose poor plan
Date: 2018-06-27 18:27:33
Message-ID: 24443.1530124053@sss.pgh.pa.us
Lists: pgsql-performance
David Wheeler <dwheeler(at)dgitsystems(dot)com> writes:
> I'm having performance trouble with a particular set of queries. It goes a bit like this
> 1) queue table is initially empty, and very narrow (1 bigint column)
> 2) we insert ~30 million rows into queue table
> 3) we do a join with queue table to delete from another table (delete from a using queue where a.id = queue.id), but postgres stats say that queue table is empty, so it uses a nested loop over all 30 million rows, taking forever
Although there's no way to have any useful pg_statistic stats if you won't
do an ANALYZE, the planner nonetheless can see the table's current
physical size, and what it normally does is to multiply the last-reported
tuple density (reltuples/relpages) by the current size. So if you're
getting an "empty table" estimate anyway, I have to suppose that the
table's state involves reltuples = 0 and relpages > 0. That's not a
good place to be in; it constrains the planner to believe that the table
is in fact devoid of tuples, because that's what the last ANALYZE saw.
Now, the initial state for a freshly-created or freshly-truncated table
is *not* that. It is reltuples = 0 and relpages = 0, representing an
undefined tuple density. Given that, the planner will make some guess
about average tuple size --- which is likely to be a very good guess,
for a table with only fixed-width columns --- and then compute a rowcount
estimate using that plus the observed physical size.
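The state Tom describes can be inspected directly: reltuples and relpages are real columns of pg_class, and the planner's rowcount guess is roughly the last-seen tuple density times the current page count. A minimal sketch (the table name queue is taken from the thread):

```sql
-- Inspect the state the planner works from. A fresh or truncated table
-- has reltuples = 0 AND relpages = 0 (undefined density, planner guesses
-- from physical size); the problem state is reltuples = 0 with
-- relpages > 0, which pins the estimate at "empty".
SELECT relname, reltuples, relpages,
       CASE WHEN relpages > 0 THEN reltuples / relpages END AS tuple_density
FROM pg_class
WHERE relname = 'queue';
```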
So I think your problem comes from oscillating between really-empty
and not-at-all-empty, and not using an idiomatic way of going back
to the empty state. Have you tried using TRUNCATE instead of DELETE?
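For comparison, a sketch of the two ways of emptying the queue table; only TRUNCATE returns it to the undefined-density state (assuming autovacuum or a manual ANALYZE runs at some point after the DELETE):

```sql
-- DELETE removes the rows but leaves the heap pages allocated until
-- VACUUM reclaims them, so an ANALYZE taken in that window records
-- reltuples = 0 with relpages > 0 -- the bad state described above:
DELETE FROM queue;

-- TRUNCATE discards the pages as well, restoring reltuples = 0 and
-- relpages = 0, so the planner falls back to estimating rowcount from
-- the table's current physical size:
TRUNCATE TABLE queue;
```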
> This queue table is empty 99% of the time, and the query in step 3 runs immediately after step 2. Is there any likelihood that tweaking the autoanalyze params would help in this case? I don't want to explicitly analyze the table between steps 2 and 3 either, as there are other patterns of use where for example 0 rows are inserted in step 2 and this is expected to run very very quickly. Do I have any other options?
I am not following your aversion to sticking an ANALYZE in there,
either. It's not like inserting 30 million rows would be free.
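Concretely, the cheap fix is a single statement between steps 2 and 3 (a sketch; the table names a and queue are from the thread, and ANALYZE on an empty table is itself nearly free, so it costs little in the 0-row pattern too):

```sql
-- After the bulk insert (step 2), refresh the stats before the join:
ANALYZE queue;

-- Step 3 then plans against an accurate rowcount and can avoid the
-- nested loop over 30 million rows:
DELETE FROM a USING queue WHERE a.id = queue.id;
```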
regards, tom lane