Re: Optimize query for listing un-read messages

From: Jochem Berndsen <jochem(at)functor(dot)nl>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimize query for listing un-read messages
Date: 2014-05-01 18:35:07
Message-ID: 536293DB.4010505@functor.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Hi Andreas,

[New to this list, forgive my ignorance.]

On 05/01/2014 01:26 PM, Andreas Joseph Krogh wrote:
> I'm using PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu
My machine has PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu.
> I have a schema where I have lots of messages and some users who might
> have read some of them. When a message is read by a user I create an
> entry i a table message_property holding the property (is_read) for
> that user.
> The schema is as follows:
> drop table if exists message_property;
> drop table if exists message;
> drop table if exists person;
> create table person(
> id serial primary key,
> username varchar not null unique
> );
> create table message(
> id serial primary key,
> subject varchar
> );
> create table message_property(
> message_id integer not null references message(id),
> person_id integer not null references person(id),
> is_read boolean not null default false,
> unique(message_id, person_id)
> );
[snip]
> So, for person 1 there are 10 unread messages, out of a total 1mill. 5
> of those unread does not have an entry in message_property and 5 have
> an entry and is_read set to FALSE.
> I have the following query to list all un-read messages for person
> with id=1:
> SELECT
> m.id AS message_id,
> prop.person_id,
> coalesce(prop.is_read, FALSE) AS is_read,
> m.subject
> FROM message m
> LEFT OUTER JOIN message_property prop ON prop.message_id = m.id
> AND prop.person_id = 1
> WHERE 1 = 1
> AND NOT EXISTS(SELECT
> *
> FROM message_property pr
> WHERE pr.message_id = m.id AND pr.person_id =
> prop.person_id AND prop.is_read = TRUE)
> ;
>
> The problem is that it's not quite efficient and performs badly,
> explain analyze shows:
[snip]

> Does anyone have suggestions on how to optimize the query or schema?

I'm getting better performance with:

SELECT
m.id AS message_id,
1 AS person_id,
FALSE AS is_read,
m.subject
FROM message m
WHERE 1 = 1
AND NOT EXISTS(SELECT
*
FROM message_property pr
WHERE pr.message_id = m.id AND pr.person_id = 1 AND pr.is_read);

You then lose the distinction between message_property with is_read =
FALSE, and nonexistent message_property for the message row.

If that is essential, I'm getting a roughly 2x speedup on my non-tuned
PostgreSQL with:
SELECT
m.id AS message_id,
prop.person_id,
coalesce(prop.is_read, FALSE) AS is_read,
m.subject
FROM message m
LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
prop.person_id = 1
WHERE not coalesce(prop.is_read, false);

HTH,
Jochem

--
Jochem Berndsen | jochem(at)functor(dot)nl

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andreas Joseph Krogh 2014-05-01 19:17:36 Re: Optimize query for listing un-read messages
Previous Message Andreas Joseph Krogh 2014-05-01 11:26:02 Optimize query for listing un-read messages