Rules, Windows and ORDER BY

From: Jason Dusek <jason(dot)dusek(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Rules, Windows and ORDER BY
Date: 2012-08-23 11:37:09
Message-ID: CAO3NbwMz011_Tve5iTW86uNx3BkRk+fd=aL2fP28sSbg20zArA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello List,

I have a simple table of keys and values which periodically
receives updated values. It's desirable to keep older values
but, most of the time, we query only for the latest value of a
particular key.

CREATE TABLE kv
( k bytea NOT NULL,
at timestamptz NOT NULL,
realm bytea NOT NULL,
v bytea NOT NULL );
CREATE INDEX ON kv USING hash(k);
CREATE INDEX ON kv (t);
CREATE INDEX ON kv USING hash(realm);

SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;

It would be nice to encapsulate this common query with a VIEW;
for example:

CREATE VIEW kv_new AS
SELECT * FROM kv WHERE at =
( SELECT at FROM kv AS _
WHERE _.k = kv.k AND _.realm = kv.realm
ORDER BY at DESC LIMIT 1 );

I tried partition functions, at first, but they were really very
slow. This view is pretty sprightly but has a more complicated
plan than the original query, which only has a sort followed by
an index scan, and is consequently not as fast. Please find the
plans below my signature.

Ideally, I'd be able to create a rule where the ORDER BY and
LIMIT were simply appended to whatever SELECT was given; but I
am at a loss as to how to do that. Creating a VIEW with the
order and limit just gives me a table with one row in it (of
course).

Is there something better than a sub-select here? I tried using
one with max(at) but it's not noticeably faster. I would be
interested to see how others have approached this kind of log-
-structured storage in Postgres. The window functions make,
alas, no use of indexes.

--
Jason Dusek
pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B

EXPLAIN (COSTS FALSE, FORMAT YAML)
SELECT * FROM kv WHERE k = ... AND realm = ... ORDER BY at LIMIT 1;
-[ RECORD 1 ]-----------------------------------------
QUERY PLAN | - Plan:
| Node Type: "Limit"
| Plans:
| - Node Type: "Sort"
| Parent Relationship: "Outer"
| Sort Key:
| - "at"
| Plans:
| - Node Type: "Index Scan"
| Parent Relationship: "Outer"
| Scan Direction: "NoMovement"
| Index Name: "kv_k_idx"
| Relation Name: "kv"
| Alias: "kv"
| Index Cond: "(k = ...)"
| Filter: "(realm = ...)"

EXPLAIN (COSTS FALSE, FORMAT YAML)
SELECT * FROM kv_new WHERE k = ... AND realm = ...;
-[ RECORD 1 ]-----------------------------------------------------
QUERY PLAN | - Plan:
| Node Type: "Index Scan"
| Scan Direction: "NoMovement"
| Index Name: "kv_k_idx"
| Relation Name: "kv"
| Alias: "kv"
| Index Cond: "(k = ...)"
| Filter: "((realm = ...) AND (at = (SubPlan 1)))"
| Plans:
| - Node Type: "Limit"
| Parent Relationship: "SubPlan"
| Subplan Name: "SubPlan 1"
| Plans:
| - Node Type: "Sort"
| Parent Relationship: "Outer"
| Sort Key:
| - "_.at"
| Plans:
| - Node Type: "Index Scan"
| Parent Relationship: "Outer"
| Scan Direction: "NoMovement"
| Index Name: "kv_k_idx"
| Relation Name: "kv"
| Alias: "_"
| Index Cond: "(k = kv.k)"
| Filter: "(realm = kv.realm)"

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vincent Veyron 2012-08-23 11:39:47 Re: Amazon High I/O instances
Previous Message Chris Travers 2012-08-23 09:13:33 Re: At what point does a big table start becoming too big?