Efficiency of timestamps

From: Martin Foster <martin(at)ethereal-realms(dot)org>
To: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Efficiency of timestamps
Date: 2003-07-09 00:27:36
Message-ID: 3F0B6178.2020805@ethereal-realms.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

As I keep looking through code to see where I can make things more
efficient, I noticed that in some cases timestamps seem horribly
inefficient. This leads to very long run times for certain queries.

Here is an example:

-- USING TIMESTAMPS TO NARROW DOWN --

SELECT
Post.PostIDNumber,
Post.PuppeteerLogin,
Post.PuppetName,
Post.PostCmd,
Post.PostClass
FROM ethereal.Post
WHERE Post.PostTimeStamp > (LOCALTIMESTAMP - INTERVAL '10 Minutes')
AND Post.RealmName='Amalgam'
AND (Post.PostTo='all' OR Post.PostTo='root')
AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin
FROM ethereal.PuppetIgnore
WHERE PuppetIgnore.PuppetIgnore='global'
AND PuppetIgnore.PuppeteerLogin='root'
AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin)
OR Post.PuppeteerLogin IS NULL)
AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName
FROM ethereal.PuppetIgnore
WHERE PuppetIgnore='single'
AND PuppetIgnore.PuppeteerLogin='root'
AND PuppetIgnore.PuppetName=Post.PuppetName)
OR Post.PuppetName IS NULL)
ORDER BY Post.PostIDNumber LIMIT 100

-- Explain of Above--
Limit (cost=0.00..260237.32 rows=100 width=48)
-> Index Scan using pkpost on post (cost=0.00..3020594.00 rows=1161
width=48)
Filter: ((posttimestamp > (('now'::text)::timestamp(6) without
time zone - '00:10'::interval)) AND (realmname = 'Amalgam'::character
varying) AND ((postto = 'all'::character varying) OR (postto =
'root'::character varying)) AND ((NOT (subplan)) OR (puppeteerlogin IS
NULL)) AND ((NOT (subplan)) OR (puppetname IS NULL)))
SubPlan
-> Index Scan using pkpuppetignore on puppetignore
(cost=0.00..13.31 rows=1 width=10)
Index Cond: (puppeteerlogin = 'root'::character varying)
Filter: ((puppetignore = 'global'::character varying)
AND (puppetlogin = $0))
-> Index Scan using pkpuppetignore on puppetignore
(cost=0.00..5.84 rows=1 width=15)
Index Cond: ((puppeteerlogin = 'root'::character
varying) AND (puppetname = $1))
Filter: (puppetignore = 'single'::character varying)

Result : 22 rows fetched (17.21 sec)

-- USING A GENERATED ID NUMBER --

SELECT
Post.PostIDNumber,
Post.PuppeteerLogin,
Post.PuppetName,
Post.PostCmd,
Post.PostClass
FROM ethereal.Post
WHERE Post.PostIDNumber > 1
AND Post.RealmName='Amalgam'
AND (Post.PostTo='all' OR Post.PostTo='root')
AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin
FROM ethereal.PuppetIgnore
WHERE PuppetIgnore.PuppetIgnore='global'
AND PuppetIgnore.PuppeteerLogin='root'
AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin)
OR Post.PuppeteerLogin IS NULL)
AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName
FROM ethereal.PuppetIgnore
WHERE PuppetIgnore='single'
AND PuppetIgnore.PuppeteerLogin='root'
AND PuppetIgnore.PuppetName=Post.PuppetName)
OR Post.PuppetName IS NULL)
ORDER BY Post.PostIDNumber LIMIT 100

-- Explain of Above--
Limit (cost=0.00..86712.10 rows=100 width=48)
-> Index Scan using pkpost on post (cost=0.00..3019119.56 rows=3482
width=48)
Index Cond: (postidnumber > 1)
Filter: ((realmname = 'Amalgam'::character varying) AND
((postto = 'all'::character varying) OR (postto = 'root'::character
varying)) AND ((NOT (subplan)) OR (puppeteerlogin IS NULL)) AND ((NOT
(subplan)) OR (puppetname IS NULL)))
SubPlan
-> Index Scan using pkpuppetignore on puppetignore
(cost=0.00..13.31 rows=1 width=10)
Index Cond: (puppeteerlogin = 'root'::character varying)
Filter: ((puppetignore = 'global'::character varying)
AND (puppetlogin = $0))
-> Index Scan using pkpuppetignore on puppetignore
(cost=0.00..5.84 rows=1 width=15)
Index Cond: ((puppeteerlogin = 'root'::character
varying) AND (puppetname = $1))
Filter: (puppetignore = 'single'::character varying)

Result : 100 rows fetched ( 0.19 sec)

-- USING A MIXTURE OF BOTH --

SELECT
Post.PostIDNumber,
Post.PuppeteerLogin,
Post.PuppetName,
Post.PostCmd,
Post.PostClass
FROM ethereal.Post
WHERE Post.PostIDNumber > (SELECT MIN(PostIDNumber)
FROM ethereal.Post
WHERE Post.PostTimeStamp > (LOCALTIMESTAMP - INTERVAL '10 minutes'))::INT
AND Post.RealmName='Amalgam'
AND (Post.PostTo='all' OR Post.PostTo='root')
AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin
FROM ethereal.PuppetIgnore
WHERE PuppetIgnore.PuppetIgnore='global'
AND PuppetIgnore.PuppeteerLogin='root'
AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin)
OR Post.PuppeteerLogin IS NULL)
AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName
FROM ethereal.PuppetIgnore
WHERE PuppetIgnore='single'
AND PuppetIgnore.PuppeteerLogin='root'
AND PuppetIgnore.PuppetName=Post.PuppetName)
OR Post.PuppetName IS NULL)
ORDER BY Post.PostIDNumber LIMIT 100

-- Explain of Above--
Limit (cost=0.00..87101.38 rows=100 width=48)
InitPlan
-> Aggregate (cost=12412.82..12412.82 rows=1 width=4)
-> Index Scan using idxpost_timestamp on post
(cost=0.00..12282.42 rows=52160 width=4)
Index Cond: (posttimestamp >
(('now'::text)::timestamp(6) without time zone - '00:10'::interval))
-> Index Scan using pkpost on post (cost=0.00..1010992.25 rows=1161
width=48)
Index Cond: (postidnumber > $0)
Filter: ((realmname = 'Amalgam'::character varying) AND
((postto = 'all'::character varying) OR (postto = 'root'::character
varying)) AND ((NOT (subplan)) OR (puppeteerlogin IS NULL)) AND ((NOT
(subplan)) OR (puppetname IS NULL)))
SubPlan
-> Index Scan using pkpuppetignore on puppetignore
(cost=0.00..13.31 rows=1 width=10)
Index Cond: (puppeteerlogin = 'root'::character varying)
Filter: ((puppetignore = 'global'::character varying)
AND (puppetlogin = $1))
-> Index Scan using pkpuppetignore on puppetignore
(cost=0.00..5.84 rows=1 width=15)
Index Cond: ((puppeteerlogin = 'root'::character
varying) AND (puppetname = $2))
Filter: (puppetignore = 'single'::character varying)

Result : 18 rows fetched ( 0.04 sec)

Both PostIDNumber and PostTimestamp are indexed, so that should not be a
bottleneck in itself. However, as you can see in the third example
the use of a sub-query actually accelerates the process considerably,
meaning that integer based searching is much much faster.

Under MySQL timestamps where in Unix time, which is why I may have never
noticed such an extreme slowdown when doing similar on that script. Of
course to boggle the mind, here is a view that works very well:

CREATE VIEW ethereal.Who AS
SELECT
Po.PuppetName AS PuppetName,
Po.PuppeteerLogin AS PuppeteerLogin,
Po.RealmName AS RealmName,
Re.RealmPublic AS RealmPublic,
Re.RealmVerified AS RealmVerified
FROM ethereal.Post Po, ethereal.Puppet Ch, ethereal.Realm Re
WHERE Po.PuppeteerLogin = Ch.PuppeteerLogin
AND Po.RealmName = Re.RealmName
AND Po.PostTimestamp > (LOCALTIMESTAMP - INTERVAL '10 minutes')
AND Po.PuppetName IS NOT NULL
GROUP BY Po.PuppeteerLogin, Po.PuppetName, Po.RealmName,
Re.RealmPublic, Re.RealmVerified
ORDER BY Po.RealmName, Po.PuppetName;

Sort (cost=309259.89..309629.34 rows=147780 width=79)
Sort Key: po.realmname, po.puppetname
-> Group (cost=270648.27..292815.19 rows=147780 width=79)
-> Sort (cost=270648.27..274342.75 rows=1477795 width=79)
Sort Key: po.puppeteerlogin, po.puppetname, po.realmname,
re.realmpublic, re.realmverified
-> Merge Join (cost=22181.60..41087.65 rows=1477795
width=79)
Merge Cond: ("outer".puppeteerlogin =
"inner".puppeteerlogin)
-> Sort (cost=17172.82..17300.26 rows=50978 width=69)
Sort Key: po.puppeteerlogin
-> Hash Join (cost=12.41..13186.95
rows=50978 width=69)
Hash Cond: ("outer".realmname =
"inner".realmname)
-> Index Scan using idxpost_timestamp
on post po (cost=0.00..12282.42 rows=50978 width=42)
Index Cond: (posttimestamp >
(('now'::text)::timestamp(6) without time zone - '00:10'::interval))
Filter: (puppetname IS NOT NULL)
-> Hash (cost=11.93..11.93 rows=193
width=27)
-> Seq Scan on realm re
(cost=0.00..11.93 rows=193 width=27)
-> Sort (cost=5008.78..5100.22 rows=36574 width=10)
Sort Key: ch.puppeteerlogin
-> Seq Scan on puppet ch
(cost=0.00..2236.74 rows=36574 width=10)

Result : 48 rows fetched ( 0.55 sec)

It uses the exact same time restraint as the first three examples, looks
through the same table, does a tipple join and still gets off at higher
speeds. This seems to indicate that timestamps are actually efficient,
which contradicts above examples.

Any ideas? Code for the table creation is below signature:

Martin Foster
Creator/Designer Ethereal Realms
martin(at)ethereal-realms(dot)org

--
--
-- NAME : Post
-- REFERENCES : Realm*
-- Puppet*
-- PuppeteerLogin*
--
-- DESCRIPTION : Post is the hive of activity for all realms.
Associated with all three
-- major tables, it is not actually linked because of the
nature of storing
-- posts for statistics and auditing.

CREATE TABLE ethereal.Post (
PostIDNumber INT NOT NULL DEFAULT
NEXTVAL('ethereal.seqPost'),
RealmName VARCHAR(30) NOT NULL,
PuppetName VARCHAR(30),
PuppeteerLogin VARCHAR(10),
PostTo VARCHAR(30),
PostTimestamp TIMESTAMP NOT NULL DEFAULT LOCALTIMESTAMP,
PostClass VARCHAR(10) NOT NULL DEFAULT 'general',
PostCmd VARCHAR(10) NOT NULL DEFAULT 'none',
PostFullFormat TEXT,
PostImagelessFormat TEXT,
PostPartialFormat TEXT,
CONSTRAINT pkPost PRIMARY KEY (PostIDNumber),
CONSTRAINT enumPostClass CHECK (PostCLass IN
('banner','dice','duplicate','general','play','private','special','system')),
CONSTRAINT enumPostCmd CHECK (PostCmd IN
('general','none','play','stream'))
) WITHOUT OIDS;

-- STANDARD INDEX
CREATE INDEX idxPost_Class ON ethereal.Post
(
PostClass
);

CREATE INDEX idxPost_Login ON ethereal.Post
(
PuppeteerLogin
);

CREATE INDEX idxPost_Puppet ON ethereal.Post
(
PuppetName
);

CREATE INDEX idxPost_Realm ON ethereal.Post
(
RealmName
);

CREATE INDEX idxPost_Timestamp ON ethereal.Post
(
PostTimestamp
);

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Stephan Szabo 2003-07-09 00:55:51 Re: Efficiency of timestamps
Previous Message Andrew Sullivan 2003-07-08 20:54:48 Re: Strange result: UNIX vs. TCP/IP sockets