From: | Gurmeet Manku <manku(at)CS(dot)Stanford(dot)EDU> |
---|---|
To: | Gurmeet Manku <manku(at)xenon(dot)Stanford(dot)EDU> |
Cc: | pgsql-perform <pgsql-performance(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Citation for "Bad n_distinct estimation; hacks suggested?" |
Date: | 2005-05-02 16:14:00 |
Message-ID: | Pine.LNX.4.44.0505020858580.28631-100000@xenon.Stanford.EDU |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Actually, the earliest paper that solves the distinct_n estimation
problem in 1 pass is the following:
"Estimating simple functions on the union of data streams"
by Gibbons and Tirthapura, SPAA 2001.
http://home.eng.iastate.edu/~snt/research/streaming.pdf
The above paper addresses a more difficult problem (1 pass
_and_ a distributed setting).
Gibbon's followup paper in VLDB 2001 limits the problem to a
single machine and contains primarily experimental results (for
a database audience). The algorithmic breakthrough had already been
accomplished in the SPAA paper.
Gurmeet
--
----------------------------------------------------
Gurmeet Singh Manku Google Inc.
http://www.cs.stanford.edu/~manku (650) 967 1890
----------------------------------------------------
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2005-05-02 16:48:07 | Re: [HACKERS] Increased company involvement |
Previous Message | Heikki Linnakangas | 2005-05-02 15:47:14 | Re: Feature freeze date for 8.1 |
From | Date | Subject | |
---|---|---|---|
Next Message | Chris Browne | 2005-05-02 16:16:42 | Re: batch inserts are "slow" |
Previous Message | David Parker | 2005-05-02 15:53:33 | Re: batch inserts are "slow" |