Re: Gsoc2012 idea, tablesample

From:	Florian Pflug <fgp(at)phlo(dot)org>
To:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	<josh(at)agliodbs(dot)com>, <andres(at)anarazel(dot)de>, <alvherre(at)commandprompt(dot)com>, <ants(at)cybertec(dot)at>, <heikki(dot)linnakangas(at)enterprisedb(dot)com>, <cbbrowne(at)gmail(dot)com>, <neil(dot)conway(at)gmail(dot)com>, <robertmhaas(at)gmail(dot)com>, <daniel(at)heroku(dot)com>, <huangqiyx(at)hotmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, <sfrost(at)snowman(dot)net>
Subject:	Re: Gsoc2012 idea, tablesample
Date:	2012-05-11 15:55:24
Message-ID:	9C5FE8E4-488B-4598-B5F6-A70AF520B6AD@phlo.org
Lists:	pgsql-hackers

On May11, 2012, at 16:03 , Kevin Grittner wrote:
>> [more complex alternatives]
>
> I really think your first suggestion covers it perfectly; these more
> complex techniques don't seem necessary to me.

The point of the more complex techniques (especially the algorithm in
my second mail, the "reply to self") was simply to optimize the generation
of a random, uniformly distributed, unique and sorted list of TIDs.

The basic idea is to make sure we generate the TIDs in physical order,
and thus automatically ensure that they are unique. This reduces the memory
(or disk) requirement to O(1) instead of O(n), and (more importantly)
makes the implementation much simpler.
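
To illustrate the property described above (not the specific algorithm from
the earlier mail, which is not reproduced here), the following is a minimal
sketch of classic selection sampling, often cited as Knuth's Algorithm S.
It walks positions in ascending order and picks each one with probability
(still needed) / (still available), so the output is sorted and unique by
construction and only two counters are kept. The function name and the flat
integer "position" standing in for a TID are assumptions made for the
example.

```python
import random


def sample_sorted_positions(n_total, n_sample, rng=None):
    """Yield n_sample distinct positions from range(n_total), ascending.

    Hypothetical helper: a position stands in for a TID in physical order.
    Memory use is O(1); the scan itself is O(n_total).
    """
    if not 0 <= n_sample <= n_total:
        raise ValueError("n_sample must be between 0 and n_total")
    if rng is None:
        rng = random.Random()

    still_needed = n_sample
    for pos in range(n_total):
        if still_needed == 0:
            return
        still_available = n_total - pos
        # Pick this position with probability still_needed / still_available.
        # When the two are equal the test always succeeds, so exactly
        # n_sample positions are produced.
        if rng.random() * still_available < still_needed:
            yield pos
            still_needed -= 1


if __name__ == "__main__":
    for pos in sample_sorted_positions(1000, 10):
        print(pos)
```

Note that this sketch visits every position once, so it saves memory rather
than time; a variant that skips ahead between picks would be needed to avoid
the full scan.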

best regards,
Florian Pflug

In response to

Re: Gsoc2012 idea, tablesample at 2012-05-11 14:03:13 from Kevin Grittner
