From: Joseph Turner <joseph(dot)turner(at)oakleynetworks(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Selecting "sample" data from large tables.
Date: 2004-06-03 17:31:22
Message-ID: 200406031131.24535.joseph.turner@oakleynetworks.com
Lists: pgsql-sql
I have a table with a decent number of rows (let's say, for example, a
billion rows). I am trying to construct a graph that displays the
distribution of that data. However, I don't want to read in the
complete data set (as reading a billion rows would take a while). Can
anyone think of a way to do this in PostgreSQL? I've been looking
online, and most of the material I've found has been for other databases.
As far as I can tell, ANSI SQL doesn't provide for this scenario.
I could potentially write a function to do this, though I'd prefer
not to. But if that's what I'm going to be stuck doing, I'd like to
know sooner rather than later. Here's the description of the table:
create table score
(
    pageId       Integer NOT NULL,
    ruleId       Integer NOT NULL,
    score        Double precision NULL,
    rowAddedDate BigInt NULL,
    primary key (pageId, ruleId)
);
I also have an index on rowAddedDate, which is just the number of
milliseconds since the epoch (Jan 1, 1970 or so [Java-style timestamps]).
I'd be willing to accept that the rowAddedDate values are random
enough to serve as a random sample.
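Something along these lines is the sort of thing I have in mind (just an
untested sketch; the LIMIT of 10000 is an arbitrary sample size I picked):

    -- Untested sketch: jump to a random point in the rowAddedDate range
    -- (which the index on rowAddedDate should help with) and take a
    -- fixed-size slice from there, assuming the dates are random enough.
    SELECT score
    FROM score
    WHERE rowAddedDate >= (
        SELECT min(rowAddedDate)
               + (random() * (max(rowAddedDate) - min(rowAddedDate)))::bigint
        FROM score
    )
    ORDER BY rowAddedDate
    LIMIT 10000;

The obvious weakness is that each sample is one contiguous slice of dates
rather than rows scattered across the whole table, so it only works if my
"random enough" assumption about rowAddedDate actually holds.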
Thanks in advance,
-- Joe T.