Re: Understanding "seq scans"

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Lele Gaifax <lele(at)metapensiero(dot)it>, pgsql-general(at)postgresql(dot)org
Subject: Re: Understanding "seq scans"
Date: 2015-10-12 21:04:16
Message-ID: 561C2050.1070506@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 10/12/2015 12:06 PM, Lele Gaifax wrote:
> Hi all,
>
> I'm doing some experiments to find the better layout for reimplementing
> an existing db (MySQL cough!) with PostgreSQL 9.4+.
>
> I noticed a strange plan coming out from a simple query joining two tables,
> both containing 10Mrecs (and both ANALYZEd):
>
> l10ntest=# \d master;
> Table "public.master"
> Column | Type | Modifiers
> --------+---------+------------------------------------------------------
> num | integer | not null default nextval('master_num_seq'::regclass)
> Indexes:
> "master_pkey" PRIMARY KEY, btree (num)
>
> l10ntest=# \d master_l10n;
> Table "public.master_l10n"
> Column | Type | Modifiers
> --------+----------------------+-----------------------------------------------------------
> num | integer | not null default nextval('master_l10n_num_seq'::regclass)
> lang | character varying(2) | not null
> text | text |
> Indexes:
> "master_l10n_pkey" PRIMARY KEY, btree (num, lang)
> "l10n_text_index" btree (lower(text) text_pattern_ops)
>
> l10ntest=# EXPLAIN SELECT count(l.num) AS count_1 FROM master_l10n l WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------
> Aggregate (cost=309315.38..309315.39 rows=1 width=4)
> -> Bitmap Heap Scan on master_l10n l (cost=64700.56..307801.65 rows=605492 width=4)
> Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~ 'quattro%'::text))
> -> Bitmap Index Scan on l10n_text_index (cost=0.00..64549.19 rows=999662 width=0)
> Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND (lower(text) ~<~ 'quattrp'::text))
> (5 rows)
>
> Time: 1.665 ms
>
> l10ntest=# EXPLAIN SELECT count(m.num) AS count_1 FROM master_l10n l JOIN master m ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
> Aggregate (cost=676558.14..676558.15 rows=1 width=4)
> -> Hash Join (cost=373011.02..675044.41 rows=605492 width=4)
> Hash Cond: (l.num = m.num)
> -> Bitmap Heap Scan on master_l10n l (cost=64700.56..307801.65 rows=605492 width=4)
> Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~ 'quattro%'::text))
> -> Bitmap Index Scan on l10n_text_index (cost=0.00..64549.19 rows=999662 width=0)
> Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND (lower(text) ~<~ 'quattrp'::text))
> -> Hash (cost=144247.76..144247.76 rows=9999976 width=4)
> -> Seq Scan on master m (cost=0.00..144247.76 rows=9999976 width=4)
> (9 rows)
>
> Time: 1.244 ms
>
> l10ntest=# SELECT count(l.num) AS count_1 FROM master_l10n l WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
> count_1
> ---------
> 1101101
> (1 row)
>
> Time: 1221.941 ms
>
> l10ntest=# SELECT count(m.num) AS count_1 FROM master_l10n l JOIN master m ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
> count_1
> ---------
> 1101101
> (1 row)
>
> Time: 3541.852 ms
>
> Why does the join on the master table require a "Seq Scan on master"? I tried
> different kinds of "JOIN", but the resulting explanation remains the same.
>
> Am I missing something, or should I stop worrying about that sequential scan?

Off hand I would say it is because of this --> count(m.num). Try
count(l.num) instead and see what happens. As your queries above show
they are the same number.

>
> Thanks in advance,
> bye, lele.
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Lele Gaifax 2015-10-12 21:46:22 Re: Understanding "seq scans"
Previous Message Kevin Grittner 2015-10-12 20:34:59 Re: Understanding "seq scans"