minimizing the target list for foreign data wrappers

From: David Gudeman <dave(dot)gudeman(at)gmail(dot)com>
To: Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: minimizing the target list for foreign data wrappers
Date: 2013-04-22 01:57:06
Message-ID: CAE4YsyjkkgdaUxRHfUAXBfQx0vdw9OE55Yu-mTSJtB25VYXgzA@mail.gmail.com
Lists: pgsql-hackers

A few years ago I wrote a roll-your-own foreign-data-wrapper system for
Postgres because Postgres didn't have one at the time (some details here
<http://unubtainabol.blogspot.com/2013/04/daves-foreign-data-introuction.html>
if anyone is interested). Now I'm being tasked to move it to Postgres 9.2.x
and I'd like to use FDW if possible.

One of the problems I'm having is that in my application, the foreign
tables typically have hundreds of columns while typical queries only access
a dozen or so (the foreign server is a columnar SQL database). Furthermore,
there is no size optimization for NULL values passed back from the foreign
server, so if I return all of the columns from the table --even as NULLs--
the returned data size will be several times the size that it needs to be.
My application cannot tolerate this level of inefficiency, so I need to
return only the columns that a query actually uses from the foreign table.

The documentation doesn't say how to do this, but looking at the code I
think it is possible. In GetForeignPlan() you have to pass on the tlist
argument, which I presume means that the query plan will use the tlist that
I pass in, right? If so, then it should be possible for me to write a
function that takes tlist and baserel->reltargetlist and returns a version
of tlist that knows which foreign-table columns are actually used, and
replaces the rest with a NULL constant.

For example, suppose the original tlist is [VAR(attrno=1),
VAR(attrno=2), VAR(attrno=3)] and reltargetlist says that I only need
columns 1 and 3. Then the new tlist would be [VAR(attrno=1),
CONST(val=NULL), VAR(attrno=2)], where the attrno of the last VAR has been
reduced by one because column 2 is no longer there.
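
To make the intended rewrite concrete, here is a minimal standalone sketch
of the renumbering in Python. It is only a model of the idea: Var, NullConst
and minimize_tlist are hypothetical names invented for illustration, not
PostgreSQL node types or planner functions, and the sketch says nothing
about whether the planner accepts a modified tlist at this stage.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass(frozen=True)
class Var:
    """Hypothetical stand-in for a column reference (1-based attrno)."""
    attrno: int


@dataclass(frozen=True)
class NullConst:
    """Hypothetical stand-in for a NULL constant target entry."""


TargetEntry = Union[Var, NullConst]


def minimize_tlist(tlist: List[TargetEntry], needed: Set[int]) -> List[TargetEntry]:
    """Replace unneeded column references with NULL constants and renumber
    the remaining ones to their positions in the reduced foreign row."""
    # Map each needed original attrno to its 1-based position among the
    # columns that will actually be fetched.
    remap = {old: new for new, old in enumerate(sorted(needed), start=1)}
    result: List[TargetEntry] = []
    for entry in tlist:
        if isinstance(entry, Var) and entry.attrno in remap:
            result.append(Var(remap[entry.attrno]))
        elif isinstance(entry, Var):
            result.append(NullConst())
        else:
            # Entries that are not column references pass through unchanged.
            result.append(entry)
    return result
```

With the example above, minimize_tlist([Var(1), Var(2), Var(3)], {1, 3})
yields [Var(1), NullConst(), Var(2)].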

I did something very much like this in my roll-your-own version of FDW so I
know basically how to do it, but I did it at the pre-planning stage and I'm
not sure how much is already packed into the other plan nodes at this
point. Maybe it's too late to change the target list?

Can anyone give me some advice or warnings on this? I'd hate to go to the
trouble of implementing and testing it only to find that I'm making some
bogus assumptions.

Thanks,
David Gudeman
