From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
---|---|
To: | pg(at)bowt(dot)ie |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, bhushan(dot)uparkar(at)gmail(dot)com, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
Subject: | Re: Index Skip Scan |
Date: | 2018-09-10 21:47:06 |
Message-ID: | CA+q6zcV-wgnSMu-YOEmyXVckncQyriiKtUZXm=Gx2s+FV=5MDw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On Mon, 18 Jun 2018 at 17:26, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com> wrote:
>
> I took Thomas Munro's previous patch [2] on the subject, added a GUC, a
> test case, documentation hooks, minor code cleanups, and made the patch
> pass an --enable-cassert make check-world run. So, the overall design is
> the same.
I've looked through the patch more closely, and have a few questions:
* Is there any reason why only copy function for the IndexOnlyScan node
includes an implementation for distinctPrefix? Without read/out functionality
skip doesn't work for parallel scans, so it becomes like that:
=# SET force_parallel_mode TO ON;
=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT DISTINCT b FROM t1;
QUERY PLAN
-------------------------------------------------------------------------------
Gather (cost=1000.43..1001.60 rows=3 width=4)
(actual time=11.054..17672.010 rows=10000000 loops=1)
Output: b
Workers Planned: 1
Workers Launched: 1
Single Copy: true
Buffers: shared hit=91035 read=167369
-> Index Skip Scan using idx_t1_b on public.t1
(cost=0.43..1.30 rows=3 width=4)
(actual time=1.350..16065.165 rows=10000000 loops=1)
Output: b
Heap Fetches: 10000000
Buffers: shared hit=91035 read=167369
Worker 0: actual time=1.350..16065.165 rows=10000000 loops=1
Buffers: shared hit=91035 read=167369
Planning Time: 0.394 ms
Execution Time: 6037.800 ms
and with this functionality it gets better:
=# SET force_parallel_mode TO ON;
=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT DISTINCT b FROM t1;
QUERY PLAN
-------------------------------------------------------------------------------
Gather (cost=1000.43..1001.60 rows=3 width=4)
(actual time=3.564..4.427 rows=3 loops=1)
Output: b
Workers Planned: 1
Workers Launched: 1
Single Copy: true
Buffers: shared hit=4 read=10
-> Index Skip Scan using idx_t1_b on public.t1
(cost=0.43..1.30 rows=3 width=4)
(actual time=0.065..0.133 rows=3 loops=1)
Output: b
Heap Fetches: 3
Buffers: shared hit=4 read=10
Worker 0: actual time=0.065..0.133 rows=3 loops=1
Buffers: shared hit=4 read=10
Planning Time: 1.724 ms
Execution Time: 4.522 ms
* What is the purpose of HeapFetches? I don't see any usage of this variable
except assigning 0 to it once.
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2018-09-10 22:12:36 | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Previous Message | Alvaro Herrera | 2018-09-10 18:37:39 | Re: can commitfest application resend a mail? |