Re: Optimize query for listing un-read messages

From: Andreas Joseph Krogh <andreas(at)visena(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Optimize query for listing un-read messages
Date: 2014-05-03 21:29:21
Message-ID: OfficeNetEmail.2a.be4b507bf0f05a09.145c3fd0696@prod2
Lists: pgsql-general

On Saturday, 3 May 2014 at 23:21:21, Alban Hertroys <haramrae(at)gmail(dot)com> wrote:
On 03 May 2014, at 12:45, Andreas Joseph Krogh <andreas(at)visena(dot)com> wrote:

> Do you really need to query message_property twice? I would think this
> would give the same results:
>
> SELECT
>     m.id                          AS message_id,
>     prop.person_id,
>     coalesce(prop.is_read, FALSE) AS is_read,
>     m.subject
> FROM message m
>     LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
>         prop.person_id = 1 AND prop.is_read = FALSE
> ;

Ah yes, of course that would match a bit too much. This however does give the
same results:

SELECT
   m.id                          AS message_id,
   prop.person_id,
   coalesce(prop.is_read, FALSE) AS is_read,
   m.subject
FROM message m
   LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
       prop.person_id = 1
WHERE prop.is_read IS NULL OR prop.is_read = FALSE
;

That shaves off half the time of the query here, namely one index scan.
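
To make the difference between the two variants concrete, here is a minimal sketch on toy data. It runs against SQLite through Python's sqlite3 module purely so that it is self-contained; that is a stand-in for PostgreSQL, and the column types and sample rows are assumptions (only the table and column names come from the thread).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed minimal schema; only the table and column names come from the thread.
CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE message_property (
    message_id INTEGER, person_id INTEGER, is_read BOOLEAN,
    PRIMARY KEY (message_id, person_id)
);
INSERT INTO message VALUES (1, 'read'), (2, 'unread'), (3, 'never opened');
INSERT INTO message_property VALUES (1, 1, 1), (2, 1, 0);
""")

SELECT_PART = """
SELECT m.id, prop.person_id, coalesce(prop.is_read, 0), m.subject
FROM message m
    LEFT OUTER JOIN message_property prop
        ON prop.message_id = m.id AND prop.person_id = 1
"""

# Variant 1: is_read test inside the ON clause. A message that person 1 has
# already read fails the join condition, so the outer join still emits it
# with NULLs, and it is reported as unread.
too_much = conn.execute(
    SELECT_PART + " AND prop.is_read = 0 ORDER BY m.id").fetchall()

# Variant 2: join on (message_id, person_id) only, then filter in WHERE.
unread = conn.execute(
    SELECT_PART + " WHERE prop.is_read IS NULL OR prop.is_read = 0 ORDER BY m.id"
).fetchall()

print([row[0] for row in too_much])  # all three messages, including the read one
print([row[0] for row in unread])    # only messages 2 and 3
```

Timing is deliberately not measured here; the sketch only shows which rows each variant returns.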

The remaining time appears to be spent finding the rows in “message” that do
not have a corresponding “message_property” row for the given (message_id,
person_id) tuple. It’s basically trying to find no needle in a haystack: you
won’t know that there is no needle until you’ve searched the entire haystack.
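
The "no needle" part can be isolated as an anti-join. The sketch below is one way to write it with NOT EXISTS; it is not a query from the thread, and the schema types and sample rows are assumed. SQLite via Python's sqlite3 module is used only to keep the example runnable on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed minimal schema; only the table and column names come from the thread.
CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE message_property (
    message_id INTEGER, person_id INTEGER, is_read BOOLEAN,
    PRIMARY KEY (message_id, person_id)
);
INSERT INTO message VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO message_property VALUES (1, 1, 1), (2, 1, 0);
""")

# Anti-join: messages for which no (message_id, person_id) row exists at all.
# Every message has to be probed before it can be declared "missing".
MISSING_SQL = """
SELECT m.id
FROM message m
WHERE NOT EXISTS (
    SELECT 1
    FROM message_property prop
    WHERE prop.message_id = m.id AND prop.person_id = ?
)
ORDER BY m.id
"""

def messages_without_property(person_id):
    """Return ids of messages that have no message_property row for person_id."""
    return [row[0] for row in conn.execute(MISSING_SQL, (person_id,))]

print(messages_without_property(1))  # messages 3 and 4 have no row for person 1
```

The cost grows with the size of "message" rather than with the number of unread messages, which is the point of the haystack remark.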

It does seem to help a bit to create separate indexes on
message_property.message_id and message_property.person_id; that reduces the
sizes of the indexes that the database needs to match and merge with each
other in order to find the missing message_ids.

I think the consensus here is to create a caching table; there's no way around
it, as PG is unable to index the difference between two sets.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas(at)visena(dot)com
www.visena.com <https://www.visena.com>
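
For what it's worth, a caching table could look roughly like the sketch below. The thread does not specify a design, so everything here is an assumption: the "person" table, the "unread_message" table, and the helper functions are hypothetical names, and SQLite via Python's sqlite3 module stands in for PostgreSQL. The idea is to store the unread set explicitly so that listing it becomes a plain indexed lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person  (id INTEGER PRIMARY KEY);                -- hypothetical
CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT);
-- Hypothetical cache: one row per (person, message) that is still unread.
CREATE TABLE unread_message (
    person_id  INTEGER NOT NULL,
    message_id INTEGER NOT NULL,
    PRIMARY KEY (person_id, message_id)
);
INSERT INTO person VALUES (1), (2);
""")

def add_message(subject):
    """Insert a message and record it as unread for every known person."""
    with conn:  # single transaction, so the cache cannot drift from "message"
        cur = conn.execute("INSERT INTO message (subject) VALUES (?)", (subject,))
        message_id = cur.lastrowid
        conn.execute(
            "INSERT INTO unread_message (person_id, message_id) "
            "SELECT id, ? FROM person", (message_id,))
    return message_id

def mark_read(message_id, person_id):
    """Drop the cache row; a real schema would also update message_property."""
    with conn:
        conn.execute(
            "DELETE FROM unread_message WHERE person_id = ? AND message_id = ?",
            (person_id, message_id))

def unread_for(person_id):
    """List unread messages with a plain primary-key range lookup."""
    return conn.execute(
        "SELECT m.id, m.subject FROM unread_message u "
        "JOIN message m ON m.id = u.message_id "
        "WHERE u.person_id = ? ORDER BY m.id", (person_id,)).fetchall()

first = add_message("hello")
second = add_message("world")
mark_read(first, 1)
print(unread_for(1))  # only "world" remains unread for person 1
```

The trade-off is write amplification: each new message costs one cache row per person, in exchange for reads that no longer have to search for missing rows.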
