Re: Query Performance / Planner estimate off

From: Mats Olsen <mats(at)duneanalytics(dot)com>
To: Victor Yegorov <vyegorov(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query Performance / Planner estimate off
Date: 2020-10-20 17:43:27
Message-ID: d0f779f0-1516-56f9-11c3-e22f4f148c00@duneanalytics.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On 10/20/20 6:51 PM, Victor Yegorov wrote:
> вт, 20 окт. 2020 г. в 16:50, Mats Olsen <mats(at)duneanalytics(dot)com
> <mailto:mats(at)duneanalytics(dot)com>>:
>
> On 10/20/20 3:04 PM, Victor Yegorov wrote:
>
>> вт, 20 окт. 2020 г. в 11:38, Mats Julian Olsen
>> <mats(at)duneanalytics(dot)com <mailto:mats(at)duneanalytics(dot)com>>:
>>
>> I'm looking for some help to manage queries against two large
>> tables.
>>
>>
>> Can you tell the version you're running currently and the output
>> of this query, please?
>>
>>     select name,setting,source from pg_settings where source not
>> in ('default','override');
>>
> Running "PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on
> x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1
> 20191008, 64-bit"
>
> Updated the gist to include the results forom pg_settings. Here's
> the direct link
> https://gist.githubusercontent.com/mewwts/9f11ae5e6a5951593b8999559f5418cf/raw/e5deebbbb48680e04570bec4e9a816fa009da34f/pg_settings
> <https://gist.githubusercontent.com/mewwts/9f11ae5e6a5951593b8999559f5418cf/raw/e5deebbbb48680e04570bec4e9a816fa009da34f/pg_settings>
>
> It looks like indexes currently chosen by the planner don't quite fit
> your query.
>
> I would create the following index (if it's possible to update schema):
>    ON "uniswap_v2.Pair_evt_Mint" (evt_tx_hash, evt_block_time)
I'll try to add this.
>
> Same for the second table, looks like
>   ON "ethereum.transactions" (hash, block_time)
> is a better fit for your query. In fact, I do not think
> `transactions_block_number_time` index is used frequently, 'cos second
> column of the index is a partitioning key.
I'll see if I can add it. This table is huge so normally we only make
changes to these when we redeploy the database.
>
> Currently planner wants to go via indexes 'cos you've made random
> access really cheap compared to sequential one (and your findings
> shows this).
> Perhaps on a NVMe disks this could work, but in your case you need to
> find the real bottleneck (therefore I asked for buffers).
>
> I would set `random_page_cost` to a 2.5 at least with your numbers.
> Also, I would check DB and indexes for bloat (just a guess now, 'cos
> your plans miss buffers figures)

Yeah, 1.1 seems way to low.

Here's the output of the explain (analyze, buffers, settings) you asked for:

vanilla: https://explain.depesz.com/s/Ktrd

set enable_nestloop=off: https://explain.depesz.com/s/mvSD

set enable_nestloop=off; set enable_seqscan=off:
https://explain.depesz.com/s/XIDo

>
>
> --
> Victor Yegorov

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Sebastian Dressler 2020-10-21 12:38:04 Re: Query Performance / Planner estimate off
Previous Message Mats Olsen 2020-10-20 17:40:40 Re: Query Performance / Planner estimate off