Re: Slow planning time when public schema included (12 vs. 9.4)

From: Anders Steinlein <anders(at)e5r(dot)no>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Slow planning time when public schema included (12 vs. 9.4)
Date: 2020-03-21 15:59:07
Message-ID: CAC35HN=mieW9zXbJAaYkK3_n8meqgA+gtM71q57SFzy+wrEa7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Mar 21, 2020 at 3:26 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Anders Steinlein <anders(at)e5r(dot)no> writes:
> > We haven't noticed any issues with this before now, until we started
> seeing
> > really slow planning time on some relatively simple queries:
> > ...
> > The planning time is the same even if running the same query multiple
> times
> > within the same session. When having only the tenant's schema in the
> > search_path, planning time is much improved:
>
> I notice a difference in these plans:
>
> > Hash Join (cost=452.96..1887.72 rows=1518 width=41) (actual
> > time=6.581..18.845 rows=2945 loops=1)
> > Hash Cond: ((cs.email)::citext = (cl.email)::citext)
> ^^^^^^ ^^^^^^
>
> > Hash Join (cost=452.96..1887.72 rows=1517 width=41) (actual
> > time=3.980..8.554 rows=2945 loops=1)
> > Hash Cond: ((cs.email)::text = (cl.email)::text)
> ^^^^ ^^^^
>
> I think what is happening is that the "cl.email = cs.email" clause
> is resolving as a different operator depending on your search path;
> probably there is a "citext = citext" operator in the public
> schema, and if available the parser will think it's a better match
> than the "text = text" operator. However, "citext = citext" can
> be orders of magnitude slower, depending on what locale settings
> you're using. That's affecting your planning time (since the
> planner will apply the operator to the values available from
> pg_stats), and it's also visibly affecting the query runtime.
>
> Not sure why you'd not have seen the same effect in your 9.4
> installation, but maybe you had citext installed somewhere else?
>

The citext extension is installed in the public schema in both instances.
Also, the second query example that I could run on both 12 and 9.4 runs
with the citext comparison in both cases. From 9.4:

mm_prod=> explain analyze select * from segments_with_contacts swc inner
join segments s using (sid) inner join contacts_lists cl on cl.email =
swc.email and s.lid = cl.lid where swc.sid = 34983 and lstatus = 'a';

QUERY PLAN

------------------------------------------------------------
------------------------------------------------------------
---------------------------------------
Hash Join (cost=700.92..1524.11 rows=444 width=187) (actual
time=13.800..20.009 rows=2295 loops=1)
Hash Cond: (swc.email = (cl.email)::citext)
^^^^^^^^
mm2_prod=> \d segments_with_contacts
Materialized view "eliksir.segments_with_contacts"
Column | Type | Modifiers
--------+---------+-----------
lid | integer |
sid | integer |
email | citext |

The tables segments and contacts_lists are identical on the two instances,
i.e. both are using citext (email domain using the citext type) on both 12
and 9.4, with the citext extension in the public schema. Is it the
lc_collate setting citext cares about? lc_collate=nb_NO.UTF-8 on both 9.4
and 12.

So I don't understand this big difference? Because it does seem like citext
is indeed the difference. I tried to modify the query to cast before
joining, and it is indeed much improved:

mm_prod=> set search_path = eliksir, public;
SET
mm_prod=> explain analyze select cs.* from contacts_segments cs inner join
segments s on s.sid = cs.segment_id inner join contacts_lists cl on
lower(cl.email::text) = lower(cs.email::text) and cl.lid = s.lid where
cs.segment_id = 34983 and cl.lstatus = 'a';

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=2518.61..4565.20 rows=101259 width=41) (actual
time=47.278..51.686 rows=2945 loops=1)
Merge Cond: ((lower((cl.email)::text)) = (lower((cs.email)::text)))
-> Sort (cost=419.78..425.24 rows=2187 width=25) (actual
time=18.283..18.516 rows=4646 loops=1)
Sort Key: (lower((cl.email)::text))
Sort Method: quicksort Memory: 665kB
-> Nested Loop (cost=0.56..298.45 rows=2187 width=25) (actual
time=0.057..9.805 rows=4646 loops=1)
-> Index Scan using segments_pkey on segments s
(cost=0.27..2.49 rows=1 width=8) (actual time=0.021..0.022 rows=1 loops=1)
Index Cond: (sid = 34983)
-> Index Scan using contacts_lists_lid_idx on
contacts_lists cl (cost=0.29..288.53 rows=744 width=25) (actual
time=0.023..4.953 rows=4646 loops=1)
Index Cond: (lid = s.lid)
Filter: ((lstatus)::bpchar = 'a'::bpchar)
Rows Removed by Filter: 6628
-> Sort (cost=2098.83..2121.98 rows=9258 width=41) (actual
time=28.988..29.373 rows=9258 loops=1)
Sort Key: (lower((cs.email)::text))
Sort Method: quicksort Memory: 1598kB
-> Bitmap Heap Scan on contacts_segments cs
(cost=127.17..1488.89 rows=9258 width=41) (actual time=0.511..7.910
rows=9258 loops=1)
Recheck Cond: (segment_id = 34983)
Heap Blocks: exact=1246
-> Bitmap Index Scan on contacts_segments_segment_id_idx
(cost=0.00..124.86 rows=9258 width=0) (actual time=0.390..0.391 rows=9258
loops=1)
Index Cond: (segment_id = 34983)
Planning Time: 0.416 ms
Execution Time: 51.924 ms
(22 rows)

Best,
-- a.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2020-03-21 19:35:51 Re: Slow planning time when public schema included (12 vs. 9.4)
Previous Message Anders Steinlein 2020-03-21 15:45:47 Re: Slow planning time when public schema included (12 vs. 9.4)