From: Matt Kelly <mkellycs(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: ashutosh(dot)bapat(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Async execution of postgres_fdw.
Date: 2015-01-21 05:22:34
Message-ID: CA+KcUkg4cvDLf4v0M9_rVv_ZuAsG1oDHPj_YvczJa6w2nSkwNQ@mail.gmail.com
Lists: pgsql-hackers
I'm trying to compare v5 and v6 on my laptop right now. Apparently my
laptop is quite a bit faster than your machine, because the tests complete
in roughly 3.3 seconds.
I added more data and didn't see anything other than noise. (Then again,
the queries were dominated by the disk sort, so I should retry with larger
work_mem.) I'll try it again when I have more time to play with it. I
suspect the benefits would be clearer over a network.
Larger than the default work_mem, yes, but I think one of the prime use
cases for the fdw is warehouse-style situations (PostgresXL-style use
cases). In those cases, work_mem might reasonably be set to 1GB. Then
even with 10KB rows you could fetch a hundred thousand rows at a time and
still stay under work_mem. A simpler change would be to vary the fetch
size with respect to work_mem.
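To put rough numbers on that (just back-of-the-envelope arithmetic, using
the 1GB work_mem and 10KB rows above):

    -- Rows that fit in a 1GB work_mem budget at ~10KB per row:
    SELECT 1073741824 / 10240 AS rows_per_fetch;  -- ≈ 104857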
Half-baked idea: I know it's the wrong point in the execution phase, but if
you are using remote estimates for costing, there should also be a row
width estimate, which I believe is based on pg_statistic and its mean
column width.
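For instance (just a sketch, querying the local pg_stats view as a stand-in
for what the remote side would report; the table name is hypothetical), the
mean column widths are already there to sum up:

    -- Approximate row width from the mean column widths that
    -- ANALYZE stores, exposed through the pg_stats view.
    SELECT sum(avg_width) AS est_row_width_bytes
    FROM pg_stats
    WHERE schemaname = 'public'
      AND tablename = 'orders';  -- hypothetical table

A memory-bounded fetch size would then be roughly work_mem in bytes divided
by that width.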
It's actually a pity that there is no way to set fetch sizes as "give
me as many tuples as will fit in less than x amount of memory", because
that is almost always exactly what you want. Even when writing application
code, I've never actually wanted precisely 10,000 rows; I've always wanted
"a reasonable-size chunk that fits in memory" and then backed my way
into how many rows that is. If we were to extend FETCH to support syntax
like: FETCH FORWARD '10MB' FROM ...; then we would eliminate the need to
estimate the value on the fly.
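Roughly what I mean (the proposed form is shown as a comment, since it
isn't valid syntax today; the cursor and table names are made up):

    BEGIN;
    DECLARE big_cur CURSOR FOR SELECT * FROM orders;  -- hypothetical table

    -- Proposed: let the server size the batch by a memory budget.
    -- FETCH FORWARD '10MB' FROM big_cur;

    -- Today: back into a row count yourself; a 10MB budget at
    -- ~1KB per row gives about 10,000 rows.
    FETCH FORWARD 10000 FROM big_cur;

    COMMIT;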
The async stuff, however, is a huge improvement over the last time I played
with the fdw. The two active postgres processes were easily consuming a
core and a half of CPU. I think it's not worth tying these two things
together; it's probably worth making them two separate discussions and
separate patches.
- Matt Kelly
*Just sanity checking myself: shutting down the server, applying the other
patch, running 'make clean install' in postgres_fdw, and then restarting
the server should obviously be sufficient to make sure it's running the new
code, because that is all linked at runtime, right?