| From: | <hohenstein(at)cs(dot)uni-kl(dot)de> |
|---|---|
| To: | <pgsql-general(at)lists(dot)postgresql(dot)org> |
| Subject: | Using the indexing and sampling APIs to realize progressive features |
| Date: | 2022-02-03 15:24:54 |
| Message-ID: | 000a01d81912$45d2d3a0$d1787ae0$@cs.uni-kl.de |
| Lists: | pgsql-general |
Hi,
I have some questions regarding the indexing and sampling APIs.
My aim is to implement a variant of progressive indexing as described in this
paper (link
<http://www.vldb.org/pvldb/vol12/p2366-holanda.pdf>). To summarize,
I want to implement a variant of online aggregation, in which an aggregate
query (such as SUM, AVG, etc.) is answered in real time and the result
becomes more and more accurate as tuples are consumed.
I thought that I could maybe use a custom sampling routine to consume table
samples until I have seen the whole table with no duplicate tuples.
Meanwhile, with every consumed sample and returned partial answer, I want to
add the tuples consumed to a progressively evolving index.
This would mean that I would have to be able to uniquely identify each row
to be able to add them to the growing index, right? Since OID is deprecated
/ phased out, I am still unsure of how to solve this.
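To make the idea concrete, here is a minimal sketch in plain Python, not Postgres code. It assumes the table fits in a list of (row_id, value) pairs, uses a random permutation as a stand-in for a sampling routine that never returns a tuple twice, and uses a sorted list as a stand-in for the growing index. The function name `progressive_sum` and the (block, offset) row identifier are made up for illustration; the identifier is only loosely modelled on the `ctid` system column, which is not stable across updates.

```python
import random
from bisect import insort


def progressive_sum(rows, batch_size, seed=0):
    """Yield (rows_seen, estimated_total, index) after each consumed batch.

    rows       -- list of (row_id, value) pairs; row_id is a hypothetical
                  unique identifier such as a (block, offset) tuple
    batch_size -- number of tuples consumed per partial answer
    """
    n = len(rows)
    order = list(range(n))
    # A random permutation visits every row exactly once, i.e. it samples
    # without replacement until the whole table has been seen.
    random.Random(seed).shuffle(order)

    seen_ids = set()   # guards against consuming a row twice
    index = []         # sorted (value, row_id) pairs: the growing "index"
    running = 0.0

    for start in range(0, n, batch_size):
        for pos in order[start:start + batch_size]:
            row_id, value = rows[pos]
            if row_id in seen_ids:
                continue
            seen_ids.add(row_id)
            running += value
            insort(index, (value, row_id))
        seen = len(seen_ids)
        # Scale the partial sum up to the full table size; the estimate
        # becomes exact once every row has been consumed.
        yield seen, running * n / seen, index


if __name__ == "__main__":
    table = [((i // 10, i % 10), float(i)) for i in range(100)]
    for seen, estimate, index in progressive_sum(table, batch_size=25):
        print(seen, estimate, len(index))
```

In this sketch the duplicate check and the index insertion both depend on the row identifier, which is exactly where I need something that identifies a row uniquely.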
Does this sound reasonable or is there an obvious flaw in my thinking?
I would also be grateful for pointers to any material beyond the Postgres
documentation that would help me get started with modifying the source to
realize something like this.
Regards
Michael H.