Re: newsfeed type query

From: Ladislav Lenart <lenartlad(at)volny(dot)cz>
To: Jonathan Vanasco <postgres(at)2xlp(dot)com>, PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: newsfeed type query
Date: 2015-04-29 16:25:42
Message-ID: 55410606.2030406@volny.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello.

On 29.4.2015 17:27, Jonathan Vanasco wrote:
>
> Thanks all! These point me in much better directions!
>
> Jim Nasby's approach to selecting an expression addressed some things (SELECT f.useraccount_id_b IS NOT NULL AS in_friends)
>
> Ladislav Lenart's usage of the CTE is also of a different format that I've used in the past.
>
> I think i'll be able to patch together some performance improvements now, that will last until the database structure changes.
>
>
> On Apr 29, 2015, at 6:54 AM, Ladislav Lenart wrote:
>
>> I think you can propagate ORDER BY and LIMIT also to the subqueries of the
>> UNION, i.e.:
>
>
> It behaves a lot better, but doesn't give me the resultset I need. Older data from one subquery is favored to newer data from another

Hmm, I don't understand why it should behave like that. Imagine the following
postings (ts is a relative timestamp):

posting ts context
p0 0 friend
p10 10 group
p20 20 friend
p30 30 group
p40 40 friend
p50 50 group
p60 60 friend

and let's say the LIMIT is 2. Then:
* The first subquery (for friends) should return p60 and p40 (in DESC order).
* The second subquery (for groups) should return p50 and p30 (in DESC order).
* The UNION should return p60 and p50.

Could you please explain to me the error(s) in my reasoning?

Thank you,

Ladislav Lenart

> I use a similar approach on another part of this application -- where the effect on the resultset isn't as pronounced.
> On that query there are over 100 million total stream events. Not using an inner limit runs the query in 7 minutes; limiting the inner subquery to 1MM runs in 70 seconds... and limiting to 10k is around 100ms.
>
>
> On Apr 29, 2015, at 10:16 AM, Melvin Davidson wrote:
>
>> I see others have responded with suggestions to improve query performance,
>> but one thing I noticed when you gave the data structure is there are no
>> no primary keys defined for friends or posting, neither are there any indexes.
>> Was that an omission?
>
> This was a quick functional example to illustrate. The real tables are slightly different but do have pkeys ( 'id' is a bigserial, relationship tables (friends, memberships) use a composite key ). They are aggressively indexed and reindexed on various columns for query performance. sometimes we create an extra index that has multiple columns or partial-columns to make make scans index-only.
>
>
>
>
>
>
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jonathan Vanasco 2015-04-29 16:54:25 Re: newsfeed type query
Previous Message Jonathan Vanasco 2015-04-29 15:27:58 Re: newsfeed type query