From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: sql/med review - problems with patching
Date: 2010-07-20 15:28:00
Message-ID: AANLkTilfzj08sgvGWDd4HhjDeo5A0ErdixPP3_H44oyQ@mail.gmail.com
Lists: pgsql-hackers
2010/7/20 David Fetter <david(at)fetter(dot)org>:
> On Tue, Jul 20, 2010 at 11:40:18AM +0200, Pavel Stehule wrote:
>> 2010/7/20 Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>:
>> > 2010/7/14 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>> >> please, can you refresh the patch?
>> >
>> > Updated patch attached. The latest version is always in the git
>> > repo: http://repo.or.cz/w/pgsql-fdw.git (branch: fdw). I'm
>> > developing the patch on postgres' git repo, so the regression test
>> > for dblink might fail because of an out-of-sync issue between CVS
>> > and git.
>> >
>> >> When I looked at the documentation I missed a tutorial for foreign
>> >> tables. There is only reference material. I miss a paragraph that
>> >> clearly and simply specifies what is possible now and what isn't.
>> >> The enhancement of dblink isn't documented.
>> >
>> > Sure. I'll start writing documentation when we agree on the design
>> > of the FDW.
>> >
>> >> In the function pgIterate(ForeignScanState *scanstate) you
>> >> iterate over the PGresult. I think that using a cursor and
>> >> fetching multiple rows would be preferable.
>> >
>> > Sure, but I'm thinking that it will be improved after libpq
>> > supports protocol-level cursors. The libpq improvement will
>> > benefit many more applications, including postgresql_fdw.
>> >
>>
>> is there some time frame for this task, or a ToDo item? At minimum it
>> has to be documented, because it can be an issue on larger sets
>> (speed, memory usage). I am afraid about the speed of queries like
>>
>>   select * from large_dblink_tab limit 100;
>
> The general issue of passing qualifiers to the remote data source is
> complex, especially when the DML for that data source is different
> from PostgreSQL's DML.
>
> Do you have some ideas as to how to solve this problem in general? In
> this case?
Yes, I can. I expect that you can read from a foreign table row by row,
because that is supported by the executor via ExecForeignScan().
Currently the dblink exec is called inside this method and the whole
result is stored; repeated calls of this method just mean repeated
reading from the tuplestore storage. So an outer LIMIT has no effect on
how much data is loaded: the full result is always fetched. I propose a
slightly different strategy: use a cursor and fetch n rows repeatedly.
The plpgsql "FOR" statement fetches 50 rows at a time; for foreign
tables 1000 could be enough. With this strategy at most (n - 1) rows
are fetched uselessly, we don't need a local tuplestore, and with a
LIMIT clause we get the first rows early.
so

ExecInitForeignScan()
  --> open a cursor

ExecForeignScan()
  --> if there are fetched rows left, return the next one
  --> otherwise, fetch the next 1000 rows

ExecEndForeignScan()
  --> close the cursor
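
To make the proposal concrete, below is a minimal standalone sketch of the
same open / fetch-in-batches / close lifecycle, written against plain libpq
rather than the real FDW callbacks. The connection string, the cursor name
fdw_cur, the batch size of 1000 and the table large_dblink_tab are
assumptions for illustration only, not part of the patch:

    /*
     * Standalone sketch of the proposed cursor strategy using plain libpq.
     * The real patch would do the equivalent work inside the
     * ExecInitForeignScan / ExecForeignScan / ExecEndForeignScan hooks.
     */
    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("dbname=postgres");
        PGresult   *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* "ExecInitForeignScan": open the cursor inside a transaction */
        PQclear(PQexec(conn, "BEGIN"));
        PQclear(PQexec(conn,
            "DECLARE fdw_cur CURSOR FOR SELECT * FROM large_dblink_tab"));

        /* "ExecForeignScan": fetch the result in batches of 1000 rows */
        for (;;)
        {
            int     ntuples,
                    i;

            res = PQexec(conn, "FETCH FORWARD 1000 FROM fdw_cur");
            if (PQresultStatus(res) != PGRES_TUPLES_OK)
            {
                fprintf(stderr, "fetch failed: %s", PQerrorMessage(conn));
                PQclear(res);
                break;
            }
            ntuples = PQntuples(res);
            if (ntuples == 0)
            {
                PQclear(res);
                break;          /* cursor exhausted */
            }
            /* print the first column as a stand-in for per-row processing */
            for (i = 0; i < ntuples; i++)
                printf("%s\n", PQgetvalue(res, i, 0));
            PQclear(res);
        }

        /* "ExecEndForeignScan": close the cursor and finish */
        PQclear(PQexec(conn, "CLOSE fdw_cur"));
        PQclear(PQexec(conn, "COMMIT"));
        PQfinish(conn);
        return 0;
    }

With a query like "select * from large_dblink_tab limit 100" this approach
returns rows as soon as the first batch arrives, instead of materializing
the whole remote result in a local tuplestore first.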
The same mechanism is used in plpgsql.

Regards

Pavel Stehule
>
> Cheers,
> David.
> --
> David Fetter <david(at)fetter(dot)org> http://fetter.org/
> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>