Re: TABLESAMPLE patch is really in pretty sad shape

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TABLESAMPLE patch is really in pretty sad shape
Date: 2015-07-24 22:36:21
Message-ID: 14236.1437777381@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Petr Jelinek <petr(at)2ndquadrant(dot)com> writes:
>> The only major difference that I see so far and I'd like you to
>> incorporate that into your patch is that I renamed the SampleScanCost to
>> SampleScanGetRelSize because that reflects much better the use of it, it
>> isn't really used for costing, but for getting the pages and tuples of
>> the baserel.

> Good suggestion. I was feeling vaguely uncomfortable with that name as
> well, given what the functionality ended up being.

After further thought it seemed like the right name to use is
SampleScanGetSampleSize, so as to avoid confusion between the size of the
relation and the size of the sample. (The planner is internally not
making such a distinction right now, but that doesn't mean we should
propagate that fuzzy thinking into the API spec.)

Attached is a more-or-less-final version of the proposed patch.
Major changes since yesterday:

* I worked over the contrib modules and docs.

* I thought of a reasonably easy way to do something with nonrepeatable
sampling methods inside join queries: we can simply wrap the SampleScan
plan node in a Materialize node, which will guard it against being
executed more than once. There might be a few corner cases where this
does not behave ideally, but in testing it seemed to do the right
thing (and I added some regression tests about that). This seems
certainly a better answer than either throwing an error or ignoring
the problem.
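
To illustrate the Materialize idea in the item above, here is a minimal Python sketch. It is not PostgreSQL code and is not taken from the patch: the names NonRepeatableSample, Materialize, and nested_loop_join are hypothetical, and the cache is filled eagerly on the first scan purely to keep the example short. It only shows why caching the sample's output keeps a join from seeing a different sample on every rescan.

```python
import random


class NonRepeatableSample:
    """Toy stand-in for a sample scan whose output can change on every rescan."""

    def __init__(self, rows, fraction, rng):
        self.rows = rows
        self.fraction = fraction
        self.rng = rng
        self.scans = 0  # how many times the underlying scan actually ran

    def scan(self):
        self.scans += 1
        # Each scan draws fresh random numbers, so two scans generally disagree.
        for row in self.rows:
            if self.rng.random() < self.fraction:
                yield row


class Materialize:
    """Runs the child once, caches its output, and replays it on rescans."""

    def __init__(self, child):
        self.child = child
        self.cache = None

    def scan(self):
        if self.cache is None:
            # First scan: execute the child exactly once and remember the rows.
            self.cache = list(self.child.scan())
        return iter(self.cache)


def nested_loop_join(outer_rows, inner, pred):
    """Naive nested loop: rescans the inner side once per outer row."""
    for o in outer_rows:
        for i in inner.scan():
            if pred(o, i):
                yield (o, i)


if __name__ == "__main__":
    sample = NonRepeatableSample(list(range(20)), 0.5, random.Random())
    inner = Materialize(sample)
    pairs = list(nested_loop_join([1, 2, 3], inner, lambda o, i: True))
    # Every outer row is paired with the same sampled rows,
    # and the sampling itself ran only once.
    print(len(pairs), sample.scans)
```

Without the Materialize wrapper, each outer row in the toy join would be matched against a freshly drawn sample, which is the behavior the wrapping is meant to guard against.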

Last chance for objections ...

regards, tom lane

Attachment: tsm-fixes-2.0.patch (text/x-diff, 304.1 KB)
