From: | Mats Julian Olsen <mats(at)duneanalytics(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Query Performance / Planner estimate off |
Date: | 2020-10-20 09:37:42 |
Message-ID: | CAARtqpGxzna+jj3LYNTacHxbC03-KdBs7XeQUaqwEkUJ+5QMbw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Dear Postgres community,
I'm looking for some help to manage queries against two large tables.
Context:
We run a relatively large postgresql instance (5TB, 32 vCPU, 120GB RAM)
with a hybrid transactional/analytical workload. Data is written in batches
every 15 seconds or so, and the all queryable tables are append-only (we
never update or delete). Our users can run analytical queries on top of
these tables.
We recently came across a series of troublesome queries one of which I'll
dive into here.
Please see the following gist for both the query we run and the \d+ output:
https://gist.github.com/mewwts/9f11ae5e6a5951593b8999559f5418cf.
The tables in question are:
- `ethereum.transactions`: 833M rows, partitioned, 171M rows after WHERE
- `uniswap_v2."Pair_evt_Swap": 12M rows, not partitioned, 12M rows after
WHERE
The crux of our issue is that the query planner chooses a nested loop join
for this query. Essentially making this query (and other queries) take a
very long time to complete. In contrast, by toggling `enable_nestloop` and
`enable_seqscan` off we can take the total runtime down from 16 minutes to
2 minutes.
1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
3) enable_nestloop=off; enable_seqscan=off (2 min):
https://explain.depesz.com/s/0WXx
How can I get Postgres not to loop over 12M rows?
Let me know if there is anything I left out here that would be useful for
further debugging.
--
Mats
CTO @ Dune Analytics
We're hiring: https://careers.duneanalytics.com
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2020-10-20 09:50:14 | Re: Query Performance / Planner estimate off |
Previous Message | aditya desai | 2020-10-20 07:17:45 | Re: CPU Consuming query. Sequential scan despite indexing. |