Re: Very slow Query compared to Oracle / SQL - Server

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Semen Yefimenko <semen(dot)yefimenko(at)gmail(dot)com>
Cc: Imre Samu <pella(dot)samu(at)gmail(dot)com>, pgsql-performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Very slow Query compared to Oracle / SQL - Server
Date: 2021-05-07 09:55:58
Message-ID: CAM+6J96ybUF4=bxh1Mc9VnYXfiJyZiHM_YxpasdF86cRHQMvAA@mail.gmail.com
Lists: pgsql-performance

Is this on Windows?

I see a thread that mentions a performance penalty due to parallel workers
on Windows. The thread's subject line is:
Huge performance penalty with parallel queries in Windows x64 v. Linux x64
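
If that thread is what's biting here, a cheap way to test the theory is to
disable parallel workers for the session and re-run the query (a sketch,
with SELECT * standing in for the full column list):

SET max_parallel_workers_per_gather = 0;  -- force a non-parallel plan for this session
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM logtable
WHERE (entrytype = 4000 OR entrytype = 4001 OR entrytype = 4002)
  AND archivestatus <= 1
ORDER BY timestampcol DESC;
RESET max_parallel_workers_per_gather;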

On Fri, 7 May 2021 at 2:33 PM Semen Yefimenko <semen(dot)yefimenko(at)gmail(dot)com>
wrote:

> As mentioned in the Slack comments:
>
> SELECT pg_size_pretty(pg_relation_size('logtable')) as table_size,
> pg_size_pretty(pg_relation_size('idx_entrytype')) as index_size,
> (pgstattuple('logtable')).dead_tuple_percent;
>
> table_size | index_size | dead_tuple_percent
> ------------+------------+--------------------
> 3177 MB | 289 MB | 0
>
> I have roughly 6 indexes, each around 300 MB.
>
> SELECT pg_relation_size('logtable') as table_size,
> pg_relation_size('idx_entrytype') as index_size,
> 100-(pgstatindex('idx_entrytype')).avg_leaf_density as bloat_ratio
>
> table_size | index_size | bloat_ratio
> ------------+------------+-------------------
> 3331694592 | 302555136 | 5.219999999999999
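>
> (Both pgstattuple() and pgstatindex() come from the pgstattuple
> extension; on a fresh database it has to be installed once before these
> queries work:)
>
> CREATE EXTENSION IF NOT EXISTS pgstattuple;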
>
> Your queries:
>
> n_live_tup n_dead_tup
> 14118380 0
>
>
> For testing, I've also been running VACUUM and ANALYZE pretty much before
> every test run.
>
> On Fri, 7 May 2021 at 10:44, Vijaykumar Jain <
> vijaykumarjain(dot)github(at)gmail(dot)com> wrote:
>
>> ok one last thing, not to be a PITA, but just in case this helps.
>>
>> postgres=# SELECT * FROM pg_stat_user_indexes where relname = 'logtable';
>> postgres=# SELECT * FROM pg_stat_user_tables where relname = 'logtable';
>> Basically, I just want to verify that the table is not bloated;
>> looking at *n_live_tup* vs *n_dead_tup* would help understand that.
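>>
>> A more focused form of that check, as a sketch (the computed percentage
>> is just for readability):
>>
>> SELECT n_live_tup, n_dead_tup,
>>        round(100.0 * n_dead_tup / greatest(n_live_tup, 1), 2) AS dead_pct
>> FROM pg_stat_user_tables
>> WHERE relname = 'logtable';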
>>
>> If you see too many dead tuples,
>> VACUUM (ANALYZE, VERBOSE) logtable; -- would get rid of the dead tuples
>> (if no transactions are still using them)
>>
>> and then run your query.
>>
>> Thanks,
>> Vijay
>>
>>
>>
>>
>>
>>
>> On Fri, 7 May 2021 at 13:34, Semen Yefimenko <semen(dot)yefimenko(at)gmail(dot)com>
>> wrote:
>>
>>> Sorry for cumulatively answering everyone in one e-mail; I'm not sure
>>> how I'm supposed to do it (single e-mails vs. many).
>>>
>>>
>>>> Can you try tuning by increasing the shared_buffers slowly in steps of
>>>> 500MB, and running explain analyze against the query.
>>>
>>>
>>> -- 2500 MB shared buffers - random_page_cost = 1;
>>> Gather Merge (cost=343085.23..392186.19 rows=420836 width=2542) (actual
>>> time=2076.329..3737.050 rows=516517 loops=1)
>>> Output: column1, .. , column54
>>> Workers Planned: 2
>>> Workers Launched: 2
>>> Buffers: shared hit=295446
>>> -> Sort (cost=342085.21..342611.25 rows=210418 width=2542) (actual
>>> time=2007.487..2202.707 rows=172172 loops=3)
>>> Output: column1, .. , column54
>>> Sort Key: logtable.timestampcol DESC
>>> Sort Method: quicksort Memory: 65154kB
>>> Worker 0: Sort Method: quicksort Memory: 55707kB
>>> Worker 1: Sort Method: quicksort Memory: 55304kB
>>> Buffers: shared hit=295446
>>> Worker 0: actual time=1963.969..2156.624 rows=161205 loops=1
>>> Buffers: shared hit=91028
>>> Worker 1: actual time=1984.700..2179.697 rows=161935 loops=1
>>> Buffers: shared hit=92133
>>> -> Parallel Bitmap Heap Scan on schema.logtable
>>> (cost=5546.39..323481.21 rows=210418 width=2542) (actual
>>> time=322.125..1618.971 rows=172172 loops=3)
>>> Output: column1, .. , column54
>>> Recheck Cond: ((logtable.entrytype = 4000) OR
>>> (logtable.entrytype = 4001) OR (logtable.entrytype = 4002))
>>> Filter: (logtable.archivestatus <= 1)
>>> Heap Blocks: exact=110951
>>> Buffers: shared hit=295432
>>> Worker 0: actual time=282.201..1595.117 rows=161205 loops=1
>>> Buffers: shared hit=91021
>>> Worker 1: actual time=303.671..1623.299 rows=161935 loops=1
>>> Buffers: shared hit=92126
>>> -> BitmapOr (cost=5546.39..5546.39 rows=510578 width=0)
>>> (actual time=199.119..199.119 rows=0 loops=1)
>>> Buffers: shared hit=1334
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..682.13 rows=67293 width=0) (actual time=28.856..28.857
>>> rows=65970 loops=1)
>>> Index Cond: (logtable.entrytype = 4000)
>>> Buffers: shared hit=172
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..2223.63 rows=219760 width=0) (actual time=108.871..108.872
>>> rows=225283 loops=1)
>>> Index Cond: (logtable.entrytype = 4001)
>>> Buffers: shared hit=581
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..2261.87 rows=223525 width=0) (actual time=61.377..61.377
>>> rows=225264 loops=1)
>>> Index Cond: (logtable.entrytype = 4002)
>>> Buffers: shared hit=581
>>> Settings: random_page_cost = '1', temp_buffers = '80MB', work_mem = '1GB'
>>> Planning Time: 0.940 ms
>>> Execution Time: 4188.083 ms
>>>
>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> -- 3000 MB shared buffers - random_page_cost = 1;
>>> Gather Merge (cost=343085.23..392186.19 rows=420836 width=2542) (actual
>>> time=2062.280..3763.408 rows=516517 loops=1)
>>> Output: column1, .. , column54
>>> Workers Planned: 2
>>> Workers Launched: 2
>>> Buffers: shared hit=295446
>>> -> Sort (cost=342085.21..342611.25 rows=210418 width=2542) (actual
>>> time=1987.933..2180.422 rows=172172 loops=3)
>>> Output: column1, .. , column54
>>> Sort Key: logtable.timestampcol DESC
>>> Sort Method: quicksort Memory: 66602kB
>>> Worker 0: Sort Method: quicksort Memory: 55149kB
>>> Worker 1: Sort Method: quicksort Memory: 54415kB
>>> Buffers: shared hit=295446
>>> Worker 0: actual time=1963.059..2147.916 rows=159556 loops=1
>>> Buffers: shared hit=89981
>>> Worker 1: actual time=1949.726..2136.200 rows=158554 loops=1
>>> Buffers: shared hit=90141
>>> -> Parallel Bitmap Heap Scan on schema.logtable
>>> (cost=5546.39..323481.21 rows=210418 width=2542) (actual
>>> time=340.705..1603.796 rows=172172 loops=3)
>>> Output: column1, .. , column54
>>> Recheck Cond: ((logtable.entrytype = 4000) OR
>>> (logtable.entrytype = 4001) OR (logtable.entrytype = 4002))
>>> Filter: (logtable.archivestatus <= 1)
>>> Heap Blocks: exact=113990
>>> Buffers: shared hit=295432
>>> Worker 0: actual time=317.918..1605.548 rows=159556 loops=1
>>> Buffers: shared hit=89974
>>> Worker 1: actual time=304.744..1589.221 rows=158554 loops=1
>>> Buffers: shared hit=90134
>>> -> BitmapOr (cost=5546.39..5546.39 rows=510578 width=0)
>>> (actual time=218.972..218.973 rows=0 loops=1)
>>> Buffers: shared hit=1334
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..682.13 rows=67293 width=0) (actual time=37.741..37.742
>>> rows=65970 loops=1)
>>> Index Cond: (logtable.entrytype = 4000)
>>> Buffers: shared hit=172
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..2223.63 rows=219760 width=0) (actual time=119.120..119.121
>>> rows=225283 loops=1)
>>> Index Cond: (logtable.entrytype = 4001)
>>> Buffers: shared hit=581
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..2261.87 rows=223525 width=0) (actual time=62.097..62.098
>>> rows=225264 loops=1)
>>> Index Cond: (logtable.entrytype = 4002)
>>> Buffers: shared hit=581
>>> Settings: random_page_cost = '1', temp_buffers = '80MB', work_mem = '1GB'
>>> Planning Time: 2.717 ms
>>> Execution Time: 4224.670 ms
>>>
>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> -- 3500 MB shared buffers - random_page_cost = 1;
>>> Gather Merge (cost=343085.23..392186.19 rows=420836 width=2542) (actual
>>> time=3578.155..4932.858 rows=516517 loops=1)
>>> Output: column1, .. , column54
>>> Workers Planned: 2
>>> Workers Launched: 2
>>> Buffers: shared hit=14 read=295432 written=67
>>> -> Sort (cost=342085.21..342611.25 rows=210418 width=2542) (actual
>>> time=3482.159..3677.227 rows=172172 loops=3)
>>> Output: column1, .. , column54
>>> Sort Key: logtable.timestampcol DESC
>>> Sort Method: quicksort Memory: 58533kB
>>> Worker 0: Sort Method: quicksort Memory: 56878kB
>>> Worker 1: Sort Method: quicksort Memory: 60755kB
>>> Buffers: shared hit=14 read=295432 written=67
>>> Worker 0: actual time=3435.131..3632.985 rows=166842 loops=1
>>> Buffers: shared hit=7 read=95783 written=25
>>> Worker 1: actual time=3441.545..3649.345 rows=179354 loops=1
>>> Buffers: shared hit=5 read=101608 written=20
>>> -> Parallel Bitmap Heap Scan on schema.logtable
>>> (cost=5546.39..323481.21 rows=210418 width=2542) (actual
>>> time=345.111..3042.932 rows=172172 loops=3)
>>> Output: column1, .. , column54
>>> Recheck Cond: ((logtable.entrytype = 4000) OR
>>> (logtable.entrytype = 4001) OR (logtable.entrytype = 4002))
>>> Filter: (logtable.archivestatus <= 1)
>>> Heap Blocks: exact=96709
>>> Buffers: shared hit=2 read=295430 written=67
>>> Worker 0: actual time=300.525..2999.403 rows=166842 loops=1
>>> Buffers: shared read=95783 written=25
>>> Worker 1: actual time=300.552..3004.859 rows=179354 loops=1
>>> Buffers: shared read=101606 written=20
>>> -> BitmapOr (cost=5546.39..5546.39 rows=510578 width=0)
>>> (actual time=241.996..241.997 rows=0 loops=1)
>>> Buffers: shared hit=2 read=1332
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..682.13 rows=67293 width=0) (actual time=37.129..37.130
>>> rows=65970 loops=1)
>>> Index Cond: (logtable.entrytype = 4000)
>>> Buffers: shared read=172
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..2223.63 rows=219760 width=0) (actual time=131.051..131.052
>>> rows=225283 loops=1)
>>> Index Cond: (logtable.entrytype = 4001)
>>> Buffers: shared hit=1 read=580
>>> -> Bitmap Index Scan on idx_entrytype
>>> (cost=0.00..2261.87 rows=223525 width=0) (actual time=73.800..73.800
>>> rows=225264 loops=1)
>>> Index Cond: (logtable.entrytype = 4002)
>>> Buffers: shared hit=1 read=580
>>> Settings: random_page_cost = '1', temp_buffers = '80MB', work_mem = '1GB'
>>> Planning Time: 0.597 ms
>>> Execution Time: 5389.811 ms
>>>
>>>
>>> This doesn't seem to have had an effect.
>>> Thanks for the suggestion.
>>>
>>>> Have you tried excluding NOT NULL from the index? Can you give the
>>>> dispersion of archivestatus?
>>>>
>>>
>>> Yes I have; it yielded the same performance boost as:
>>>
>>> create index test on logtable(entrytype) where archivestatus <= 1;
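>>>
>>> (A further variant that might be worth testing, as a sketch and not
>>> something already tried in this thread: a partial index that also
>>> carries the sort key, so the ORDER BY could potentially be served from
>>> the index; idx_entrytype_ts is a made-up name.)
>>>
>>> create index idx_entrytype_ts on logtable (entrytype, timestampcol desc)
>>>     where archivestatus <= 1;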
>>>
>>>> I wonder what the old query plan was...
>>>> Would you include links to your prior correspondence?
>>>
>>>
>>> The prior execution plans are present in the SO post.
>>> The other forums I've tried are the official Slack channel:
>>> https://postgresteam.slack.com/archives/C0FS3UTAP/p1620286295228600
>>> and SO:
>>> https://stackoverflow.com/questions/67401792/slow-running-postgresql-query
>>> But I think most of the points discussed in these posts have already
>>> been mentioned by you, except the bloating of indexes.
>>>
>>>> Oracle is apparently doing a single scan on "entrytype".
>>>> As a test, you could try forcing that, like:
>>>> begin; SET enable_bitmapscan=off ; explain (analyze) [...]; rollback;
>>>> or
>>>> begin; DROP INDEX idx_arcstatus; explain (analyze) [...]; rollback;
>>>
>>>
>>> I've tried enable_bitmapscan=off but it didn't yield any good results.
>>>
>>> -- 2000 MB shared buffers - random_page_cost = 4 - enable_bitmapscan = off
>>> Gather Merge (cost=543949.72..593050.69 rows=420836 width=2542) (actual
>>> time=7716.031..9043.399 rows=516517 loops=1)
>>> Output: column1, .., column54
>>> Workers Planned: 2
>>> Workers Launched: 2
>>> Buffers: shared hit=192 read=406605
>>> -> Sort (cost=542949.70..543475.75 rows=210418 width=2542) (actual
>>> time=7642.666..7835.527 rows=172172 loops=3)
>>> Output: column1, .., column54
>>> Sort Key: logtable.timestampcol DESC
>>> Sort Method: quicksort Memory: 58803kB
>>> Worker 0: Sort Method: quicksort Memory: 60376kB
>>> Worker 1: Sort Method: quicksort Memory: 56988kB
>>> Buffers: shared hit=192 read=406605
>>> Worker 0: actual time=7610.482..7814.905 rows=177637 loops=1
>>> Buffers: shared hit=78 read=137826
>>> Worker 1: actual time=7607.645..7803.561 rows=167316 loops=1
>>> Buffers: shared hit=80 read=132672
>>> -> Parallel Seq Scan on schema.logtable (cost=0.00..524345.70
>>> rows=210418 width=2542) (actual time=1.669..7189.365 rows=172172 loops=3)
>>> Output: column1, .., column54
>>> Filter: ((logtable.archivestatus <= 1) AND
>>> ((logtable.entrytype = 4000) OR (logtable.entrytype = 4001) OR
>>> (logtable.entrytype = 4002)))
>>> Rows Removed by Filter: 4533459
>>> Buffers: shared hit=96 read=406605
>>> Worker 0: actual time=1.537..7158.286 rows=177637 loops=1
>>> Buffers: shared hit=30 read=137826
>>> Worker 1: actual time=1.414..7161.670 rows=167316 loops=1
>>> Buffers: shared hit=32 read=132672
>>> Settings: enable_bitmapscan = 'off', temp_buffers = '80MB', work_mem =
>>> '1GB'
>>> Planning Time: 0.725 ms
>>> Execution Time: 9500.928 ms
>>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> -- 2000 MB shared buffers - random_page_cost = 1 - enable_bitmapscan = off
>>> Gather Merge (cost=543949.72..593050.69 rows=420836 width=2542) (actual
>>> time=7519.032..8871.433 rows=516517 loops=1)
>>> Output: column1, .., column54
>>> Workers Planned: 2
>>> Workers Launched: 2
>>> Buffers: shared hit=576 read=406221
>>> -> Sort (cost=542949.70..543475.75 rows=210418 width=2542) (actual
>>> time=7451.958..7649.480 rows=172172 loops=3)
>>> Output: column1, .., column54
>>> Sort Key: logtable.timestampcol DESC
>>> Sort Method: quicksort Memory: 58867kB
>>> Worker 0: Sort Method: quicksort Memory: 58510kB
>>> Worker 1: Sort Method: quicksort Memory: 58788kB
>>> Buffers: shared hit=576 read=406221
>>> Worker 0: actual time=7438.271..7644.241 rows=172085 loops=1
>>> Buffers: shared hit=203 read=135166
>>> Worker 1: actual time=7407.574..7609.922 rows=172948 loops=1
>>> Buffers: shared hit=202 read=135225
>>> -> Parallel Seq Scan on schema.logtable (cost=0.00..524345.70
>>> rows=210418 width=2542) (actual time=2.839..7017.729 rows=172172 loops=3)
>>> Output: column1, .., column54
>>> Filter: ((logtable.archivestatus <= 1) AND
>>> ((logtable.entrytype = 4000) OR (logtable.entrytype = 4001) OR
>>> (logtable.entrytype = 4002)))
>>> Rows Removed by Filter: 4533459
>>> Buffers: shared hit=480 read=406221
>>> Worker 0: actual time=2.628..7006.420 rows=172085 loops=1
>>> Buffers: shared hit=155 read=135166
>>> Worker 1: actual time=3.948..6978.154 rows=172948 loops=1
>>> Buffers: shared hit=154 read=135225
>>> Settings: enable_bitmapscan = 'off', random_page_cost = '1',
>>> temp_buffers = '80MB', work_mem = '1GB'
>>> Planning Time: 0.621 ms
>>> Execution Time: 9339.457 ms
>>>
>>>> Have you tuned shared_buffers enough? Each block is 8 kB by default.
>>>> BTW, please try to reset random_page_cost.
>>>
>>>
>>> See above.
>>>
>>> I will try upgrading the minor version next.
>>> I will also try setting up a 13.x version locally, importing the data
>>> from 12.2 into 13.x, and seeing if it might be faster.
>>>
>>>
>>> On Thu, 6 May 2021 at 23:16, Imre Samu <pella(dot)samu(at)gmail(dot)com> wrote:
>>>
>>>> > *Postgres Version:* PostgreSQL 12.2,
>>>> > ... ON ... USING btree
>>>>
>>>> IMHO:
>>>> The next minor (bugfix & security) release is near (expected ~ May 13th,
>>>> 2021): https://www.postgresql.org/developer/roadmap/
>>>> so you can update your PostgreSQL to 12.7 (+ full reindexing
>>>> recommended!)
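>>>>
>>>> (A sketch of that full reindex; REINDEX ... CONCURRENTLY exists since
>>>> PostgreSQL 12 and avoids blocking writes while it rebuilds:)
>>>>
>>>> REINDEX TABLE CONCURRENTLY schema.logtable;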
>>>>
>>>> You can find a lot of B-tree index-related fixes.
>>>> https://www.postgresql.org/docs/12/release-12-3.html Release date:
>>>> 2020-05-14
>>>> - Fix possible undercounting of deleted B-tree index pages in VACUUM
>>>> VERBOSE output
>>>> - Fix wrong bookkeeping for oldest deleted page in a B-tree index
>>>> - Ensure INCLUDE'd columns are always removed from B-tree pivot tuples
>>>> https://www.postgresql.org/docs/12/release-12-4.html
>>>> - Avoid repeated marking of dead btree index entries as dead
>>>> https://www.postgresql.org/docs/12/release-12-5.html
>>>> - Fix failure of parallel B-tree index scans when the index condition
>>>> is unsatisfiable
>>>> https://www.postgresql.org/docs/12/release-12-6.html Release date:
>>>> 2021-02-11
>>>>
>>>>
>>>> > COLLATE pg_catalog."default"
>>>>
>>>> You can test the "C" collation on some columns (keys?); in
>>>> theory, it should be faster:
>>>> "The drawback of using locales other than C or POSIX in PostgreSQL is
>>>> its performance impact. It slows character handling and prevents ordinary
>>>> indexes from being used by LIKE. For this reason use locales only if you
>>>> actually need them."
>>>> https://www.postgresql.org/docs/12/locale.html
>>>>
>>>> https://www.postgresql.org/message-id/flat/CAF6DVKNU0vb4ZeQQ-%3Dagg69QJU3wdjPnMYYrPYY7CKc6iOU7eQ%40mail.gmail.com
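>>>>
>>>> (A way to test this without changing the table, as a sketch: an extra
>>>> index on the key column with the "C" collation; idx_key_c is a made-up
>>>> name.)
>>>>
>>>> CREATE INDEX idx_key_c ON logtable (key COLLATE "C");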
>>>>
>>>> Best,
>>>> Imre
>>>>
>>>>
>>>> On Thu, 6 May 2021 at 16:38, Semen Yefimenko
>>>> <semen(dot)yefimenko(at)gmail(dot)com> wrote:
>>>>
>>>>> Hi there,
>>>>>
>>>>> I've recently been involved in migrating our old system to SQL Server
>>>>> and then PostgreSQL. Everything has been working fine so far, but after
>>>>> executing our tests on Postgres, we saw a very slow-running query on a
>>>>> large table in our database.
>>>>> I have tried asking on other platforms but no one has been able to
>>>>> give me a satisfying answer.
>>>>>
>>>>> *Postgres Version:* PostgreSQL 12.2, compiled by Visual C++ build
>>>>> 1914, 64-bit
>>>>> No notable errors in the server log or on the Postgres server itself.
>>>>>
>>>>> The table structure:
>>>>>
>>>>> CREATE TABLE logtable
>>>>> (
>>>>> key character varying(20) COLLATE pg_catalog."default" NOT NULL,
>>>>> id integer,
>>>>> column3 integer,
>>>>> column4 integer,
>>>>> column5 integer,
>>>>> column6 integer,
>>>>> column7 integer,
>>>>> column8 integer,
>>>>> column9 character varying(128) COLLATE pg_catalog."default",
>>>>> column10 character varying(2048) COLLATE pg_catalog."default",
>>>>> column11 character varying(2048) COLLATE pg_catalog."default",
>>>>> column12 character varying(2048) COLLATE pg_catalog."default",
>>>>> column13 character varying(2048) COLLATE pg_catalog."default",
>>>>> column14 character varying(2048) COLLATE pg_catalog."default",
>>>>> column15 character varying(2048) COLLATE pg_catalog."default",
>>>>> column16 character varying(2048) COLLATE pg_catalog."default",
>>>>> column17 character varying(2048) COLLATE pg_catalog."default",
>>>>> column18 character varying(2048) COLLATE pg_catalog."default",
>>>>> column19 character varying(2048) COLLATE pg_catalog."default",
>>>>> column21 character varying(256) COLLATE pg_catalog."default",
>>>>> column22 character varying(256) COLLATE pg_catalog."default",
>>>>> column23 character varying(256) COLLATE pg_catalog."default",
>>>>> column24 character varying(256) COLLATE pg_catalog."default",
>>>>> column25 character varying(256) COLLATE pg_catalog."default",
>>>>> column26 character varying(256) COLLATE pg_catalog."default",
>>>>> column27 character varying(256) COLLATE pg_catalog."default",
>>>>> column28 character varying(256) COLLATE pg_catalog."default",
>>>>> column29 character varying(256) COLLATE pg_catalog."default",
>>>>> column30 character varying(256) COLLATE pg_catalog."default",
>>>>> column31 character varying(256) COLLATE pg_catalog."default",
>>>>> column32 character varying(256) COLLATE pg_catalog."default",
>>>>> column33 character varying(256) COLLATE pg_catalog."default",
>>>>> column34 character varying(256) COLLATE pg_catalog."default",
>>>>> column35 character varying(256) COLLATE pg_catalog."default",
>>>>> entrytype integer,
>>>>> column37 bigint,
>>>>> column38 bigint,
>>>>> column39 bigint,
>>>>> column40 bigint,
>>>>> column41 bigint,
>>>>> column42 bigint,
>>>>> column43 bigint,
>>>>> column44 bigint,
>>>>> column45 bigint,
>>>>> column46 bigint,
>>>>> column47 character varying(128) COLLATE pg_catalog."default",
>>>>> timestampcol timestamp without time zone,
>>>>> column49 timestamp without time zone,
>>>>> column50 timestamp without time zone,
>>>>> column51 timestamp without time zone,
>>>>> column52 timestamp without time zone,
>>>>> archivestatus integer,
>>>>> column54 integer,
>>>>> column55 character varying(20) COLLATE pg_catalog."default",
>>>>> CONSTRAINT pkey PRIMARY KEY (key)
>>>>> USING INDEX TABLESPACE tablespace
>>>>> )
>>>>>
>>>>> TABLESPACE tablespace;
>>>>>
>>>>> ALTER TABLE schema.logtable
>>>>> OWNER to user;
>>>>>
>>>>> CREATE INDEX idx_timestampcol
>>>>> ON schema.logtable USING btree
>>>>> ( timestampcol ASC NULLS LAST )
>>>>> TABLESPACE tablespace ;
>>>>>
>>>>> CREATE INDEX idx_test2
>>>>> ON schema.logtable USING btree
>>>>> ( entrytype ASC NULLS LAST)
>>>>> TABLESPACE tablespace
>>>>> WHERE archivestatus <= 1;
>>>>>
>>>>> CREATE INDEX idx_arcstatus
>>>>> ON schema.logtable USING btree
>>>>> ( archivestatus ASC NULLS LAST)
>>>>> TABLESPACE tablespace;
>>>>>
>>>>> CREATE INDEX idx_entrytype
>>>>> ON schema.logtable USING btree
>>>>> ( entrytype ASC NULLS LAST)
>>>>> TABLESPACE tablespace ;
>>>>>
>>>>>
>>>>> The table contains 14,000,000 entries and has about 3.3 GB of data.
>>>>> No triggers; inserts are probably 5-20 K per day.
>>>>>
>>>>> SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts,
>>>>> relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE
>>>>> relname='logtable';
>>>>>
>>>>> relname |relpages|reltuples|relallvisible|relkind|relnatts|relhassubclass|reloptions|pg_table_size|
>>>>> --------|--------|---------|-------------|-------|--------|--------------|----------|-------------|
>>>>> logtable|  405988| 14091424|       405907|r      |      54|false         |NULL      |   3326803968|
>>>>>
>>>>>
>>>>> The slow running query:
>>>>>
>>>>> SELECT column1, ..., column54 FROM logtable
>>>>> WHERE ((entrytype = 4000 OR entrytype = 4001 OR entrytype = 4002)
>>>>> AND (archivestatus <= 1)) ORDER BY timestampcol DESC;
>>>>>
>>>>>
>>>>> This query runs in about 45-60 seconds.
>>>>> The same query runs in about 289 ms in Oracle and 423 ms in SQL Server.
>>>>> Now I understand that actually loading all results would take a while
>>>>> (about 520 K rows or so), but that shouldn't be exactly what happens,
>>>>> right? There should be a result-set iterator which can retrieve all the
>>>>> data but doesn't do so from the get-go.
>>>>>
>>>>> With the help of some people in the Slack and SO threads, I've found a
>>>>> configuration parameter which helps performance:
>>>>>
>>>>> set random_page_cost = 1;
>>>>>
>>>>> This improved performance from 45-60 s to 15-35 s (since we are using
>>>>> SSDs).
>>>>> Still not acceptable, but definitely an improvement.
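>>>>>
>>>>> (If the setting should outlive the session, and assuming the whole
>>>>> cluster runs on SSDs, it can also be persisted; 1.1 rather than 1 is
>>>>> the value often suggested for SSDs:)
>>>>>
>>>>> ALTER SYSTEM SET random_page_cost = 1.1;
>>>>> SELECT pg_reload_conf();
>>>>>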
>>>>> Some possibly relevant system parameters:
>>>>>
>>>>> effective_cache_size 4GB
>>>>> maintenance_work_mem 1GB
>>>>> shared_buffers 2GB
>>>>> work_mem 1GB
>>>>>
>>>>>
>>>>> Currently I'm accessing the data through DBeaver (JDBC -
>>>>> postgresql-42.2.5.jar) and our Java application (JDBC -
>>>>> postgresql-42.2.19.jar). Both use defaultRowFetchSize=5000 so as not to
>>>>> load everything into memory and to limit the results.
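>>>>>
>>>>> (For reference, a sketch of what batched fetching amounts to on the
>>>>> server side, using an explicit cursor; log_cur is a made-up name. Note
>>>>> that the ORDER BY forces a full sort before the first batch can be
>>>>> returned, so a fetch size alone cannot make the first rows arrive early
>>>>> with this plan.)
>>>>>
>>>>> BEGIN;
>>>>> DECLARE log_cur CURSOR FOR
>>>>>     SELECT *
>>>>>     FROM logtable
>>>>>     WHERE entrytype IN (4000, 4001, 4002)
>>>>>       AND archivestatus <= 1
>>>>>     ORDER BY timestampcol DESC;
>>>>> FETCH 5000 FROM log_cur;  -- first batch, what the driver would request
>>>>> CLOSE log_cur;
>>>>> COMMIT;
>>>>>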
>>>>> The explain plan:
>>>>>
>>>>> EXPLAIN (ANALYZE, BUFFERS, SETTINGS, VERBOSE)...
>>>>> (Above Query)
>>>>>
>>>>>
>>>>> Gather Merge (cost=347142.71..397196.91 rows=429006 width=2558)
>>>>> (actual time=21210.019..22319.444 rows=515841 loops=1)
>>>>> Output: column1, .. , column54
>>>>> Workers Planned: 2
>>>>> Workers Launched: 2
>>>>> Buffers: shared hit=141487 read=153489
>>>>> -> Sort (cost=346142.69..346678.95 rows=214503 width=2558) (actual
>>>>> time=21148.887..21297.428 rows=171947 loops=3)
>>>>> Output: column1, .. , column54
>>>>> Sort Key: logtable.timestampcol DESC
>>>>> Sort Method: quicksort Memory: 62180kB
>>>>> Worker 0: Sort Method: quicksort Memory: 56969kB
>>>>> Worker 1: Sort Method: quicksort Memory: 56837kB
>>>>> Buffers: shared hit=141487 read=153489
>>>>> Worker 0: actual time=21129.973..21296.839 rows=166776 loops=1
>>>>> Buffers: shared hit=45558 read=49514
>>>>> Worker 1: actual time=21114.439..21268.117 rows=165896 loops=1
>>>>> Buffers: shared hit=45104 read=49506
>>>>> -> Parallel Bitmap Heap Scan on schema.logtable
>>>>> (cost=5652.74..327147.77 rows=214503 width=2558) (actual
>>>>> time=1304.813..20637.462 rows=171947 loops=3)
>>>>> Output: column1, .. , column54
>>>>> Recheck Cond: ((logtable.entrytype = 4000) OR
>>>>> (logtable.entrytype = 4001) OR (logtable.entrytype = 4002))
>>>>> Filter: (logtable.archivestatus <= 1)
>>>>> Heap Blocks: exact=103962
>>>>> Buffers: shared hit=141473 read=153489
>>>>> Worker 0: actual time=1280.472..20638.620 rows=166776
>>>>> loops=1
>>>>> Buffers: shared hit=45551 read=49514
>>>>> Worker 1: actual time=1275.274..20626.219 rows=165896
>>>>> loops=1
>>>>> Buffers: shared hit=45097 read=49506
>>>>> -> BitmapOr (cost=5652.74..5652.74 rows=520443
>>>>> width=0) (actual time=1179.438..1179.438 rows=0 loops=1)
>>>>> Buffers: shared hit=9 read=1323
>>>>> -> Bitmap Index Scan on idx_entrytype
>>>>> (cost=0.00..556.61 rows=54957 width=0) (actual time=161.939..161.940
>>>>> rows=65970 loops=1)
>>>>> Index Cond: (logtable.entrytype = 4000)
>>>>> Buffers: shared hit=1 read=171
>>>>> -> Bitmap Index Scan on idx_entrytype
>>>>> (cost=0.00..2243.22 rows=221705 width=0) (actual time=548.849..548.849
>>>>> rows=224945 loops=1)
>>>>> Index Cond: (logtable.entrytype = 4001)
>>>>> Buffers: shared hit=4 read=576
>>>>> -> Bitmap Index Scan on idx_entrytype
>>>>> (cost=0.00..2466.80 rows=243782 width=0) (actual time=468.637..468.637
>>>>> rows=224926 loops=1)
>>>>> Index Cond: (logtable.entrytype = 4002)
>>>>> Buffers: shared hit=4 read=576
>>>>> Settings: random_page_cost = '1', search_path = '"$user", schema,
>>>>> public', temp_buffers = '80MB', work_mem = '1GB'
>>>>> Planning Time: 0.578 ms
>>>>> Execution Time: 22617.351 ms
>>>>>
>>>>> As mentioned before, Oracle does this much faster.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------------------------------------------------
>>>>> | Id | Operation                             | Name          | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
>>>>> ---------------------------------------------------------------------------------------------------------------
>>>>> |  0 | SELECT STATEMENT                      |               | 6878 | 2491K |       |  2143   (1)| 00:00:01 |
>>>>> |  1 |  SORT ORDER BY                        |               | 6878 | 2491K | 3448K |  2143   (1)| 00:00:01 |
>>>>> |  2 |   INLIST ITERATOR                     |               |      |       |       |            |          |
>>>>> |* 3 |    TABLE ACCESS BY INDEX ROWID BATCHED| logtable      | 6878 | 2491K |       |  1597   (1)| 00:00:01 |
>>>>> |* 4 |     INDEX RANGE SCAN                  | idx_entrytype | 6878 |       |       |    23   (0)| 00:00:01 |
>>>>> ---------------------------------------------------------------------------------------------------------------
>>>>>
>>>>> Is there anything else I can analyze, or any information you might need
>>>>> to analyze this further?
>>>>>
>>>>
>>
>> --
>> Thanks,
>> Vijay
>> Mumbai, India
>>
--
Thanks,
Vijay
Mumbai, India
