From Simple to Complex

From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: From Simple to Complex
Date: 2012-01-31 21:22:12
Message-ID: CAAB3BB+TB+SkR4cnySdkhzC9VNxaZ4MBcba8SMfz6KP1Pi8mxg@mail.gmail.com
Lists: pgsql-performance

My slow query today is somewhat more complex than yesterday's, but I'm
hopeful it can be improved. Here's the query:

SELECT relname, emotion, COUNT(feedback_id) FROM pg_class, moments
JOIN emotions USING (moment_id)
WHERE moments.inserted > 'today' AND moments.tableoid = pg_class.oid
GROUP BY relname, emotion ORDER BY relname, emotion;

As you'll see below, moments is inherited by a number of other tables
and the purpose of relname is to see which one. Meanwhile, emotions
inherits feedback.
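For what it's worth, since every row of moments already carries tableoid, the join against pg_class may be avoidable entirely by casting tableoid to regclass; a possible rewrite (untested against this schema, and it also drops the mixed implicit/explicit join syntax in favor of explicit joins only):

```sql
-- Hypothetical rewrite: cast tableoid::regclass instead of joining pg_class,
-- and use only explicit JOIN syntax. Untested against this schema.
SELECT m.tableoid::regclass AS relname,
       e.emotion,
       COUNT(e.feedback_id)
FROM moments m
JOIN emotions e USING (moment_id)
WHERE m.inserted > 'today'
GROUP BY 1, 2
ORDER BY 1, 2;
```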

Here's the Full Table and Index Schema:

CREATE TABLE moments
(
  moment_id character(24) NOT NULL DEFAULT to_char(now(), 'JHH24MISSUS'::text),
  block_id character(24) NOT NULL,
  inserted timestamp without time zone NOT NULL DEFAULT now(),
  CONSTRAINT moments_pkey PRIMARY KEY (moment_id )
)
WITH (
  OIDS=FALSE
);

CREATE INDEX moments_block_id_idx
  ON moments
  USING btree
  (block_id );

CREATE INDEX moments_inserted_idx
  ON moments
  USING btree
  (inserted );

CREATE TABLE feedback
(
feedback_id character(24) NOT NULL,
user_id character(24) NOT NULL,
moment_id character(24) NOT NULL,
created timestamp without time zone,
inserted timestamp without time zone NOT NULL DEFAULT now(),
lnglat point,
CONSTRAINT feedback_pkey PRIMARY KEY (feedback_id )
)
WITH (
OIDS=FALSE
);

CREATE INDEX feedback_lnglat_idx
ON feedback
USING gist
(lnglat );

CREATE INDEX feedback_moment_id_idx
ON feedback
USING btree
(moment_id );

CREATE TABLE emotions
(
-- Inherited from table feedback: feedback_id character(24) NOT NULL,
-- Inherited from table feedback: user_id character(24) NOT NULL,
-- Inherited from table feedback: moment_id character(24) NOT NULL,
-- Inherited from table feedback: created timestamp without time zone,
-- Inherited from table feedback: inserted timestamp without time zone NOT NULL DEFAULT now(),
emotion character varying NOT NULL,
-- Inherited from table feedback: lnglat point,
CONSTRAINT emotions_pkey PRIMARY KEY (feedback_id )
)
INHERITS (feedback)
WITH (
OIDS=FALSE
);

CREATE INDEX emotions_emotion_idx
ON emotions
USING btree
(emotion );
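Note that in Postgres, child tables do not inherit the parent's indexes, so feedback_moment_id_idx does not cover emotions. If the planner were ever to use an index for the moment_id join against emotions, the child table would need its own; a speculative addition (the index name is made up here):

```sql
-- Speculative: indexes are not inherited, so emotions gets no benefit
-- from feedback_moment_id_idx on the parent table.
CREATE INDEX emotions_moment_id_idx
  ON emotions
  USING btree
  (moment_id);
```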

Here are the results from EXPLAIN ANALYZE:

"Sort (cost=309717.70..309718.43 rows=1460 width=94) (actual time=60462.534..60462.544 rows=25 loops=1)"
"  Sort Key: pg_class.relname, emotions.emotion"
"  Sort Method: quicksort Memory: 20kB"
"  ->  HashAggregate (cost=309697.24..309702.35 rows=1460 width=94) (actual time=60462.457..60462.476 rows=25 loops=1)"
"        ->  Hash Join (cost=133154.62..308963.70 rows=489024 width=94) (actual time=26910.488..60031.589 rows=194642 loops=1)"
"              Hash Cond: (public.moments.tableoid = pg_class.oid)"
"              ->  Hash Join (cost=133144.72..307119.96 rows=489024 width=34) (actual time=26909.984..59434.137 rows=194642 loops=1)"
"                    Hash Cond: (public.moments.moment_id = emotions.moment_id)"
"                    ->  Append (cost=0.00..114981.64 rows=119665 width=29) (actual time=883.153..21696.939 rows=357565 loops=1)"
"                          ->  Seq Scan on moments (cost=0.00..0.00 rows=1 width=104) (actual time=0.000..0.000 rows=0 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on thoughts moments (cost=0.00..38856.88 rows=44388 width=29) (actual time=883.150..9040.959 rows=115436 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on photos moments (cost=0.00..29635.78 rows=194 width=29) (actual time=5329.700..5827.447 rows=116420 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on music moments (cost=0.00..9371.88 rows=19070 width=29) (actual time=354.147..383.266 rows=37248 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on people moments (cost=0.00..5945.26 rows=27 width=29) (actual time=185.393..185.393 rows=0 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on places moments (cost=0.00..24551.03 rows=54961 width=29) (actual time=5224.044..5324.517 rows=85564 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on videos moments (cost=0.00..981.31 rows=734 width=29) (actual time=21.075..28.735 rows=2897 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on facebook_people moments (cost=0.00..10.84 rows=80 width=104) (actual time=0.001..0.001 rows=0 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on address_people moments (cost=0.00..10.84 rows=80 width=104) (actual time=0.005..0.005 rows=0 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on path_people moments (cost=0.00..5606.79 rows=30 width=29) (actual time=211.166..211.166 rows=0 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                          ->  Seq Scan on sleep moments (cost=0.00..11.05 rows=100 width=104) (actual time=0.002..0.002 rows=0 loops=1)"
"                                Filter: (inserted > '2012-01-31 00:00:00'::timestamp without time zone)"
"                    ->  Hash (cost=79292.49..79292.49 rows=4059496 width=55) (actual time=25757.998..25757.998 rows=4058642 loops=1)"
"                          Buckets: 262144 Batches: 4 Memory Usage: 75211kB"
"                          ->  Seq Scan on emotions (cost=0.00..79292.49 rows=4059496 width=55) (actual time=0.012..15969.981 rows=4058642 loops=1)"
"              ->  Hash (cost=8.88..8.88 rows=292 width=68) (actual time=0.487..0.487 rows=319 loops=1)"
"                    Buckets: 1024 Batches: 1 Memory Usage: 28kB"
"                    ->  Seq Scan on pg_class (cost=0.00..8.88 rows=292 width=68) (actual time=0.013..0.234 rows=319 loops=1)"
"Total runtime: 60601.612 ms"

Postgres version: still 9.0.5

History: N/A (This is the first time I've run this query.)

Hardware: 1.7 GB Cache and other things you'd expect from a Ronin
instance of a Heroku Postgres database.

Maintenance Setup: Whatever Heroku does. As before, vacuum should not be
relevant, as there are no deletes or even updates (just inserts and
selects).

WAL Configuration: I still don't know. Heroku hosts the database on
Amazon's servers, so maybe that answers the question?

GUC Settings: As per yesterday's discussion, I reduced
random_page_cost to 2. Other than that, it's all default.

Bonus question: If that was too simple, here's something even more
complex I'd like to do: I have another table that inherits feedback
called "comments". Ideally, rather than an "emotion" column coming
out, I would like to have a "feedback_type" column that would be
either the value in the emotion column of the emotions table, or
"comment" if it's from the comments table. I'm thinking I'm going to
have to simply put that together on the client, but if I can do that
in a single query (that doesn't take an hour to run) that would be
super cool. But that's definitely secondary compared to getting the
above query to run faster.
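The feedback_type idea above could perhaps be sketched as a UNION ALL over the two child tables, labeling each row with either its emotion or the literal 'comment' (untested; it assumes comments inherits moment_id and feedback_id from feedback just as emotions does):

```sql
-- Hypothetical sketch for the bonus question: derive a feedback_type
-- column that is either the emotion value or the literal 'comment'.
SELECT m.tableoid::regclass AS relname,
       f.feedback_type,
       COUNT(f.feedback_id)
FROM moments m
JOIN (
    SELECT moment_id, feedback_id, emotion AS feedback_type FROM emotions
    UNION ALL
    SELECT moment_id, feedback_id, 'comment' FROM comments
) f USING (moment_id)
WHERE m.inserted > 'today'
GROUP BY 1, 2
ORDER BY 1, 2;
```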

Thank you very much for any help!
-Alessandro Gagliardi
