Re: Slow index scan on B-Tree index over timestamp field

From: Igor Neyman <ineyman(at)perceptron(dot)com>
To: Caio Casimiro <casimiro(dot)listas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow index scan on B-Tree index over timestamp field
Date: 2013-11-04 21:26:09
Message-ID: A76B25F2823E954C9E45E32FA49D70EC4283BB9B@mail.corp.perceptron.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

From: Caio Casimiro [mailto:casimiro(dot)listas(at)gmail(dot)com]
Sent: Monday, November 04, 2013 4:10 PM
To: Igor Neyman
Cc: Jeff Janes; pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Slow index scan on B-Tree index over timestamp field

Hi Neyman, thank you for your answer.
Unfortunately this query runs almost at the same time:

Sort  (cost=4877693.98..4877702.60 rows=3449 width=20) (actual time=25820.291..25821.845 rows=1640 loops=1)
  Sort Key: tt.tweet_id
  Sort Method: quicksort  Memory: 97kB
  Buffers: shared hit=1849 read=32788
  ->  Nested Loop  (cost=247.58..4877491.32 rows=3449 width=20) (actual time=486.839..25814.120 rows=1640 loops=1)
        Buffers: shared hit=1849 read=32788
        ->  Hash Semi Join  (cost=229.62..88553.23 rows=1681 width=8) (actual time=431.654..13209.159 rows=597 loops=1)
              Hash Cond: (t.user_id = relationship.followed_id)
              Buffers: shared hit=3 read=31870
              ->  Index Scan using tweet_creation_time_index on tweet t  (cost=0.57..83308.25 rows=1781234 width=16) (actual time=130.144..10037.764 rows=1759645 loops=1)
                    Index Cond: ((creation_time >= '2013-05-05 00:00:00-03'::timestamp with time zone) AND (creation_time <= '2013-05-06 00:00:00-03'::timestamp with time zone))
                    Buffers: shared hit=1 read=31867
              ->  Hash  (cost=227.12..227.12 rows=154 width=8) (actual time=94.365..94.365 rows=106 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 3kB
                    Buffers: shared hit=2 read=3
                    ->  Index Only Scan using relationship_id on relationship  (cost=0.42..227.12 rows=154 width=8) (actual time=74.540..94.101 rows=106 loops=1)
                          Index Cond: (follower_id = 335093362)
                          Heap Fetches: 0
                          Buffers: shared hit=2 read=3
        ->  Bitmap Heap Scan on tweet_topic tt  (cost=17.96..2841.63 rows=723 width=20) (actual time=21.014..21.085 rows=3 loops=597)
              Recheck Cond: (tweet_id = t.id)
              Buffers: shared hit=1846 read=918
              ->  Bitmap Index Scan on tweet_topic_pk  (cost=0.00..17.78 rows=723 width=0) (actual time=15.012..15.012 rows=3 loops=597)
                    Index Cond: (tweet_id = t.id)
                    Buffers: shared hit=1763 read=632
Total runtime: 25823.386 ms

I have noticed that in both queries the index scan on tweet_creation_time_index is very expensive. Is there anything I can do to make the planner choose a index only scan?

Thank you,
Caio

Just try the following:

SELECT tt.tweet_id, tt.topic, tt.topic_value
            FROM tweet_topic AS tt  JOIN tweet AS t ON (tt.tweet_id = t.id
                                                  AND t.creation_time BETWEEN 'D1' AND 'D2' AND t.user_id in
                                         (SELECT followed_id FROM relationship WHERE follower_id = N))
 ORDER BY tt.tweet_id;

And see if it helps with performance.

Regards,
Igor Neyman

What is your hardware configuration, and Postgres config parameters modified from default values?

Regards,
Igor Neyman

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Caio Casimiro 2013-11-04 21:32:45 Re: Slow index scan on B-Tree index over timestamp field
Previous Message Elliot 2013-11-04 21:22:47 Re: Slow index scan on B-Tree index over timestamp field