From: | Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> |
---|---|
To: | PgHacker <pgsql-hackers(at)postgresql(dot)org> |
Subject: | What's needed for cache-only table scan? |
Date: | 2013-11-12 14:45:43 |
Message-ID: | CADyhKSXzw=yON0egjLYwFB+y36wJWRY=UTxJdJ+gMuXsaG9=vg@mail.gmail.com |
Lists: | pgsql-hackers |
Hello,
This is a brief design proposal for a feature I'd like to implement on top of
the custom-scan APIs. Because it (probably) requires a few additional base
features beyond custom-scan itself, I'd like to get feedback from the hackers.
The cache-only table scan, named in the subject line, is an alternative scan
logic to sequential scan, applicable when all the referenced columns are cached.
It allows a particular table to be scanned without storage access, thus
improving scan performance.
So what? How is this different from a large shared_buffers configuration?
This mechanism intends to cache only the columns referenced in the query,
not whole records. It makes sense for workloads that scan a table with many
columns while the qualifiers reference just a few of them, as is typical of
analytic queries, because it reduces the memory consumed per cached record,
so a larger number of records can be cached.
In addition, it has another role from my standpoint: it also serves as a
fast data supplier to GPU/MIC devices. When we move data to a GPU device,
the source address has to be in a region marked as "page-locked", which is
exempt from being swapped out, if we want CUDA or OpenCL to run in
asynchronous DMA transfer mode; the fastest one.
There is probably no problem in constructing this table cache.
All we need to do is inject a custom-scan node instead of seq-scan;
it can then construct the table cache concurrently with a regular
seq-scan, even though the first access becomes a little slower.
My concern is how to handle the case when the table gets modified.
A straightforward idea is that each cached entry being modified
shall be invalidated through a callback mechanism.
Triggers can help in the case of INSERT, UPDATE, DELETE and
TRUNCATE. Furthermore, it would be better if an extension could
inject its own trigger definition at RelationBuildTriggers() on the
fly, to perform the feature transparently.
On the other hand, we have no way to get control around VACUUM.
I want a hook that allows extensions to get control when a page
gets vacuumed. Once we have such a hook, it enables invalidation
of cached entries that are indexed by tid but have already been
vacuumed. Its best location is probably lazy_scan_heap(), calling
back the extension for each page being vacuumed.
What is your opinion?
I'd like to find the best way to implement this table-caching
mechanism within the scope of the v9.4 functionality set.
Any ideas, comments or suggestions are welcome.
Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>