Improving DISTINCT with LooseScan node

From: Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Improving DISTINCT with LooseScan node
Date: 2017-09-17 17:43:09
Message-ID: 110251505670189@web3o.yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

<div>Hi hackers,</div><div> </div><div>Everybody knows, that we have <u>unefficient execution of query like "SELECT DISTINCT id from mytable"</u></div><div><u>if table has many-many rows and only several unique id values. Query plan looks like Unique + IndexScan.</u></div><div> </div><div><u>I have tried to implement this feature in new type of node called Loose Scan.</u></div><div><u>This node must appears in plan together with IndexScan or IndexOnlyScan just like Unique node in this case.</u></div><div><u>But instead of filtering rows with equal values, LooseScan must retreive first row from IndexScan,</u></div><div><u>then remember and return this. With all subsequent calls LooseScan must initiate calling index_rescan via ExecReScan</u></div><div><u>with search value that &gt; or &lt; (depending on scan direction) of previous value.</u></div><div>Total cost of this path must be something like total_cost = n_distinct_values * subpath-&gt;startup_cost</div><div>What do you think about this idea?</div><div> </div><div>I was able to create new LooseScan node, but i ran into difficulties with changing scan keys.</div><div>I looked (for example) on the ExecReScanIndexOnlyScan function and I find it difficult to understand</div><div>how i can reach the ioss_ScanKeys through the state of executor. Can you help me with this?</div><div> </div><div>I also looked on the Nested Loop node, which as i think must change scan keys, but i didn't become clear for me.</div><div>The only thought that came to my head, that maybe i incorrectly create this subplan.</div><div>I create it just like create_upper_unique_plan, and create subplan for IndexScan via create_plan_recurse.</div><div>Can you tell me where to look or maybe somewhere there are examples?</div><div> </div><div>Thanks</div><div> </div><div>Regards,</div><div>Dmitriy Sarafannikov</div>

Attachment Content-Type Size
unknown_filename text/html 1.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2017-09-17 18:18:54 Re: [PATCH] Overestimated filter cost and its mitigation
Previous Message Jeff Davis 2017-09-17 17:24:34 Re: Range Merge Join v1