From: | Bob Lunney <bob_lunney(at)yahoo(dot)com> |
---|---|
To: | Alessandro Gagliardi <alessandro(at)path(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: From Simple to Complex |
Date: | 2012-02-01 19:04:04 |
Message-ID: | 1328123044.31389.YahooMailNeo@web39707.mail.mud.yahoo.com |
Lists: | pgsql-performance |
Possibly. What does
psql > show work_mem;
say?
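If it's still at the 1 MB default, a sort of 60000-70000 rows can easily spill to disk. It can be raised for just your session (the 64MB below is only an example value, pick what your RAM allows):

psql > SET work_mem = '64MB';  -- session-local, reverts on disconnect
psql > ... run the query again ...
psql > RESET work_mem;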
Bob Lunney
________________________________
From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Sent: Wednesday, February 1, 2012 12:19 PM
Subject: Re: [PERFORM] From Simple to Complex
Final update on this thread: since I only need a rough ratio of the distribution (not an absolute count), I refactored the query to sample from the moments table with a subquery:

SELECT moment_id, block_id
FROM moments
WHERE inserted BETWEEN 'yesterday' AND 'today'
ORDER BY RANDOM() LIMIT 10000;

I also took advantage of another table, blocks, which happens to contain moment_type as well (so I no longer need to reference pg_class). The final query looks like:
SELECT moment_type, emotion, COUNT(feedback_id)
FROM (SELECT moment_id, block_id
FROM moments
WHERE inserted BETWEEN 'yesterday' AND 'today'
ORDER BY RANDOM() LIMIT 10000) AS sample_moments
JOIN blocks USING (block_id)
JOIN emotions USING (moment_id)
GROUP BY moment_type, emotion
ORDER BY moment_type, emotion
The explain is at http://explain.depesz.com/s/lYh
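One thing worth noting: the ORDER BY RANDOM() subquery still reads and sorts every row from the day just to keep 10000 of them. If the server were running PostgreSQL 9.5 or later (an assumption; TABLESAMPLE does not exist in older releases), the sampling could skip the sort entirely:

SELECT moment_type, emotion, COUNT(feedback_id)
FROM (SELECT moment_id, block_id
      FROM moments TABLESAMPLE BERNOULLI (1)  -- keep ~1% of rows, chosen per-row at random
      WHERE inserted BETWEEN 'yesterday' AND 'today') AS sample_moments
JOIN blocks USING (block_id)
JOIN emotions USING (moment_id)
GROUP BY moment_type, emotion
ORDER BY moment_type, emotion;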
Interestingly, increasing the limit does not increase the runtime linearly. With a limit of 60000 the query runs in 14991 ms, but with a limit of 70000 it takes 77744 ms. I assume that's because I'm hitting a memory limit and paging out. Is that right?
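(A quick way to confirm would be to EXPLAIN ANALYZE just the sampling subquery at both limits. ORDER BY RANDOM() LIMIT n can use a fast in-memory top-N heapsort only while n rows fit in work_mem; past that point Postgres sorts the whole day's rows, typically spilling to disk, which would explain the cliff between 60000 and 70000.)

EXPLAIN (ANALYZE, BUFFERS)
SELECT moment_id, block_id
FROM moments
WHERE inserted BETWEEN 'yesterday' AND 'today'
ORDER BY RANDOM() LIMIT 70000;
-- "Sort Method: top-N heapsort  Memory: ..."  => fit in work_mem
-- "Sort Method: external merge  Disk: ..."    => spilled to disk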
On Tue, Jan 31, 2012 at 3:43 PM, Alessandro Gagliardi <alessandro(at)path(dot)com> wrote:
I just got a pointer on presenting EXPLAIN ANALYZE in a more human friendly fashion (thanks, Agent M!): http://explain.depesz.com/s/A9S
>
>
>From this it looks like the bottleneck is the Index Scan using emotions_moment_id_idx on emotions, which happens before the filter on moments.inserted, so I thought I'd try filtering on emotions.inserted instead, but that only made it worse. At the same time, I noticed that "FROM pg_class, moments WHERE moments.tableoid = pg_class.oid" tends to run a bit faster than "FROM pg_class JOIN moments ON moments.tableoid = pg_class.oid". So I tried:
>
>
>SELECT relname, emotion, COUNT(feedback_id)
> FROM pg_class, moments, emotions
> WHERE moments.tableoid = pg_class.oid
> AND emotions.inserted > 'yesterday'
> AND moments.inserted BETWEEN 'yesterday' AND 'today'
> AND emotions.moment_id = moments.moment_id
> GROUP BY relname, emotion
> ORDER BY relname, emotion;
>
>
>That was a bit faster, but still very slow. Here's the EXPLAIN: http://explain.depesz.com/s/ZdF
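>Since the only thing I use pg_class for is relname, another option I could try is casting tableoid to regclass, which yields the table name without touching pg_class at all:
>
>SELECT moments.tableoid::regclass AS relname, emotion, COUNT(feedback_id)
> FROM moments
> JOIN emotions USING (moment_id)
> WHERE moments.inserted BETWEEN 'yesterday' AND 'today'
> AND emotions.inserted > 'yesterday'
> GROUP BY relname, emotion
> ORDER BY relname, emotion;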
>
>
>On Tue, Jan 31, 2012 at 2:53 PM, Alessandro Gagliardi <alessandro(at)path(dot)com> wrote:
>
>>I changed the query a bit so the results would not change over the
>>course of the day to:
>>
>>
>>SELECT relname, emotion, COUNT(feedback_id) FROM pg_class, moments
>>JOIN emotions USING (moment_id)
>>WHERE moments.inserted BETWEEN 'yesterday' AND 'today' AND
>>
>>moments.tableoid = pg_class.oid
>>GROUP BY relname, emotion ORDER BY relname, emotion;
>>