Working with very large datasets

From: Wilkinson Charlie E <Charlie(dot)E(dot)Wilkinson(at)irs(dot)gov>
To: pgsql-sql(at)postgresql(dot)org
Subject: Working with very large datasets
Date: 2003-02-11 23:42:45
Message-ID: 9DDDC002B736D711A7320004AC3699EB01C192C7@nct0010mb03.nc.no.irs.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Greetings,
Can anyone enlighten me or point me at resources concerning use of pgsql
with
very large datasets?

My specific problem is this:

I have two tables, one with about 100 million rows and one with about 22,000
rows. My plan was to inner join the two tables on an integer key and output
the 4 significant columns, excluding the keys. (Those with a better
understanding
of pgsql internals, feel free to laugh.) The result was a big angry psql
that
grew to 800+MB before I had to kill it.

Example:

CREATE TABLE users (
userid INTEGER NOT NULL,
fileid INTEGER NOT NULL
);
CREATE INDEX "users_userid_key" on "users" using btree ( "userid" "int4_ops"
);

CREATE TABLE files
(
fileid SERIAL NOT NULL,
name VARCHAR(120),
size INTEGER,
mtime INTEGER
);
-- files.fileid will automagically get indexed

[100 million inserts to 'users', 22,000 inserts to 'files'...]

SELECT userid,name,size,mtime FROM users INNER JOIN files ON users.fileid =
files.fileid;

[attack of the 50 foot psql...]

Did I miss sonething important? Is there a better way? Any clues
appreciated.

-cw-

(Apologies for any HTML that gets tacked onto this message. I spec'd
plaintext!)
--
Charlie Wilkinson
TRIS Development Systems Administrator
IS:SD:CT:CC:TD
Phone: 202-283-3241
MSMail: Charlie(dot)E(dot)Wilkinson(at)irs(dot)gov
SMTP: cwilkins(at)tris(dot)irs(dot)gov
Home: cwilkins(at)boinklabs(dot)com
This message constructed from 90% post-consumer electrons.

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Susan 2003-02-11 23:44:50 adding not null constraints on columns
Previous Message Wei Weng 2003-02-11 23:28:38 What is wrong with this identification configuration?