Re: TABLESAMPLE patch is really in pretty sad shape

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TABLESAMPLE patch is really in pretty sad shape
Date: 2015-07-25 01:24:58
Message-ID: 18120.1437787498@sss.pgh.pa.us
Lists: pgsql-hackers

Petr Jelinek <petr(at)2ndquadrant(dot)com> writes:
> I was wondering if we should perhaps cache the output of GetTsmRoutine
> as we call it up to 4 times in the planner now but it's relatively cheap
> call (basically just makeNode) so it's probably not worth it.

Yeah, I was wondering about that too. The way to do it would probably
be to add a TsmRoutine pointer to RelOptInfo. I'm not concerned at all
about the makeNode/fill-the-struct cost, but the syscache lookup involved
in getting from the function OID to the function might be worth worrying
about. As things stand, it didn't quite seem worth the trouble, but if
we add any more planner lookups of the TsmRoutine then I'd want to do it.
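
A minimal sketch of that caching idea, for illustration only. Every name
below (RelOptInfoSketch, TsmRoutineSketch, lookup_tsm_routine,
get_cached_tsm_routine) is a hypothetical stand-in rather than the real
planner code, and the counter merely models the syscache lookup cost:

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's real definitions. */
typedef unsigned int Oid;

typedef struct TsmRoutineSketch
{
    Oid         handler_oid;    /* which handler produced this routine */
} TsmRoutineSketch;

typedef struct RelOptInfoSketch
{
    Oid         tsm_handler;    /* handler function OID of the sample method */
    TsmRoutineSketch *tsmroutine;   /* cached pointer, NULL until first use */
    TsmRoutineSketch storage;   /* backing storage for the cached routine */
} RelOptInfoSketch;

/* Counts how often the "expensive" path runs (models the syscache lookup). */
static int  expensive_lookups = 0;

/* Stand-in for the OID-to-function lookup plus filling in the struct. */
static void
lookup_tsm_routine(Oid handler, TsmRoutineSketch *out)
{
    expensive_lookups++;
    out->handler_oid = handler;
}

/* Do the lookup once per relation; later calls return the cached pointer. */
static TsmRoutineSketch *
get_cached_tsm_routine(RelOptInfoSketch *rel)
{
    if (rel->tsmroutine == NULL)
    {
        lookup_tsm_routine(rel->tsm_handler, &rel->storage);
        rel->tsmroutine = &rel->storage;
    }
    return rel->tsmroutine;
}
```

With something along these lines, four planner call sites asking for the
routine of the same relation would pay for a single lookup.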

Another place for future improvement is to store the sample-size outputs
separately in RelOptInfo instead of overwriting pages/tuples. I'm not
sure it's worth the complication right now, but if we ever support doing
sampling with more than one scan plan type (eg bernoulli filtering in
an indexscan), we'd pretty much have to do that in order to be able to
compute costs sanely.
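
Roughly, and again only as a sketch under assumptions: the struct and
field names here (RelSizeSketch, sample_pages, sample_tuples) and the
simple fraction-based estimate are hypothetical, not the actual
RelOptInfo layout or any sampling method's size estimator:

```c
/* Hypothetical stand-in for the size-related part of RelOptInfo. */
typedef struct RelSizeSketch
{
    double      pages;          /* whole-table page count, never overwritten */
    double      tuples;         /* whole-table tuple count, never overwritten */
    double      sample_pages;   /* pages the sampled scan expects to visit */
    double      sample_tuples;  /* tuples the sampled scan expects to return */
} RelSizeSketch;

/*
 * Record the sample-size estimate next to the base statistics instead of
 * on top of them.  "fraction" is an assumed sampling fraction in [0, 1].
 */
static void
set_sample_size(RelSizeSketch *rel, double fraction)
{
    rel->sample_pages = rel->pages * fraction;
    rel->sample_tuples = rel->tuples * fraction;
}
```

Keeping pages/tuples intact is what would let a second plan type cost
itself against the full table while still seeing the sample estimate.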

regards, tom lane
