Re: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down

From: "Gabe F(dot) Rudy" <rudy(at)goldenhelix(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FDW handling count(*) through AnalyzeForeignTable or other constant time push-down
Date: 2016-02-26 16:30:13
Message-ID: SN1PR17MB033474485224715BF6FB4B6ED4A70@SN1PR17MB0334.namprd17.prod.outlook.com
Lists: pgsql-hackers

Ok, I get that.

What I am really *rooting* for is aggregate (and sort) push-down to FDW plugins.

I can already handle conditional filters inside my FDW for most cases, and counting the filtered results in my back-end would be considerably faster than constructing all the records and Datums just so Postgres can do the counting.
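To make the point concrete, here is a minimal sketch of a filtered count done entirely inside a column-store back-end. It assumes nothing about the real FDW: `IntColumn` and `count_at_least` are hypothetical names invented for illustration, and no PostgreSQL API is used. The loop only reads the column and increments a counter, so no per-row record or Datum is ever built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical column-store column: one contiguous array of values. */
typedef struct {
    const int32_t *values;  /* column data */
    size_t nrows;           /* number of rows in the column */
} IntColumn;

/*
 * Count the rows whose value is >= min_value by scanning the column
 * directly. Only a counter is updated; no row structure is constructed.
 */
size_t
count_at_least(const IntColumn *col, int32_t min_value)
{
    size_t count = 0;

    assert(col != NULL && (col->values != NULL || col->nrows == 0));

    for (size_t i = 0; i < col->nrows; i++) {
        if (col->values[i] >= min_value)
            count++;
    }
    return count;
}
```

With aggregate push-down, a result like this could be handed back as a single value instead of one tuple per matching row.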

Similarly, I'm very excited about the potential for an FDW to advertise a priori sort orders, so that things like external merge sorts can pass the request for sorted data through for fields on which sorting is a no-op in my back-end.

Importantly, my IDs are sorted by definition, since they are essentially array indexes into the column store, so a merge join on them should be blazing fast. Currently, though, time is wasted sorting these pre-sorted fields.
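As an illustration of why the extra sort is wasted work, here is a rough sketch of a merge join over two ID lists that are already in ascending order. It assumes the IDs are unique within each list (as array indexes would be); `merge_join_ids` is a hypothetical helper, not part of PostgreSQL or of my FDW. Both inputs are walked once, with no sort step in front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Merge join of two ascending, duplicate-free ID arrays.
 * Matching IDs are written to out, which must have room for at least
 * min(nleft, nright) entries. Returns the number of matches.
 * Neither input is sorted here: the ordering is assumed to hold already.
 */
size_t
merge_join_ids(const uint32_t *left, size_t nleft,
               const uint32_t *right, size_t nright,
               uint32_t *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(out != NULL);

    while (i < nleft && j < nright) {
        if (left[i] < right[j]) {
            i++;                    /* left ID has no partner yet */
        } else if (left[i] > right[j]) {
            j++;                    /* right ID has no partner yet */
        } else {
            out[n++] = left[i];     /* IDs match: emit and advance both */
            i++;
            j++;
        }
    }
    return n;
}
```

The whole join costs one pass over each side, which is what I would expect to get if the planner knew the IDs arrive in order.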

Just my 2c, and I'll be tracking the 9.6 progress that includes some of these proposals.

Gabe

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Thursday, February 25, 2016 11:21 PM
To: Gabe F. Rudy <rudy(at)goldenhelix(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] FDW handling count(*) through AnalyzeForeignTable or other constant time push-down

"Gabe F. Rudy" <rudy(at)goldenhelix(dot)com> writes:
> Is there any way to convince Postgres FDW to leverage the analyze row counts or even the "double* totalRowCount" returned from the AcquireSampleRows callback from my AnalyzeForeignTable function so that it does not do a full-table scan for a COUNT(*) etc?

No. In PG's view, ANALYZE-based row counts are imprecise by definition.

regards, tom lane
