Re: PostgreSQL Views Where Clause Character Columns getting Typecasted to Text and Indexes not getting Used

From: Jagmohan Kaintura <jagmohan(at)tecorelabs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: PostgreSQL Views Where Clause Character Columns getting Typecasted to Text and Indexes not getting Used
Date: 2021-03-30 04:21:12
Message-ID: CA+cYFtsw0xucDXjUEkikw1ajYsywXxt=jYgJLnN6HftrT-00yA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Tom,

Thanks for your analysis!!
As we were discussing on the typecasted column not able to use the indexes
as base table columns are with datatype varchar and they are type casted to
text internally .

We made some more analysis and tried to create some test tables with
columns datatype as *text *which are being referenced in the *where* clause.

tms_vessel_visit_aggregate_test
tms_owner_cz1qa(at)zdcq12c2:5434=> \d tms_vessel_visit_aggregate_test
Table "tms_owner_cz1qa.tms_vessel_visit_aggregate_test"
Column | Type | Collation |
Nullable | Default
-------------------------+--------------------------------+-----------+----------+---------
vessel_visit_c | text | | not
null |
....

tms_vessel_visit_aggregate_bak_test
tms_owner_cz1qa(at)zdcq12c2:5434=> \d tms_vessel_visit_aggregate_bak_test
Table "tms_owner_cz1qa.tms_vessel_visit_aggregate_bak_test"
Column | Type | Collation |
Nullable | Default
-------------------------+--------------------------------+-----------+----------+---------
vessel_visit_c | text | | not
null |
...

tms_vessel_visit_test
tms_owner_cz1qa(at)zdcq12c2:5434=> \d tms_vessel_visit_test
Table "tms_owner_cz1qa.tms_vessel_visit_test"
Column | Type | Collation
| Nullable | Defa
ult
-------------------------------+--------------------------------+-----------+----------+-------
-------
vessel_visit_c | text |
| not null |
....

tms_vessel_visit_bak_test
...
vessel_visit_c | text |
| not null |
....

After above modification created a new View TMV_VESSEL_VISIT_TEXT_VERIFY.
Now as the base columns are *text and *typecast is also* text. *We are
getting a good optimum plan as we expected.

The application call which was earlier taking 7-8 seconds for execution is
now taking 900-1100 msec. It got improved by many factors.
From the application single call view is getting executed 4k times.

Aggregate (cost=47.91..47.92 rows=1 width=32) (actual time=0.021..0.024
rows=1 loops=1)
Buffers: shared hit=5
-> Nested Loop (cost=4.49..47.91 rows=1 width=0) (actual
time=0.019..0.021 rows=0 loops=1)
Buffers: shared hit=5
-> Nested Loop (cost=0.57..41.83 rows=1 width=38) (actual
time=0.019..0.020 rows=0 loops=1)
Join Filter: ((t.trgt_vvd_n)::text = "*SELECT*
1".vessel_visit_c)
Buffers: shared hit=5

* -> Index Scan using tms_tdr_pk on tms_tdr t
(cost=0.29..2.50 rows=1 width=7) (actual time=0.011..0.012 rows=1 loops=1)*
Index Cond: (tdr_id = '13581258'::numeric)
Buffers: shared hit=3
-> Append (cost=0.28..39.30 rows=2 width=70) (actual
time=0.005..0.006 rows=0 loops=1)
Buffers: shared hit=2
-> Subquery Scan on "*SELECT* 1" (cost=0.28..19.65
rows=1 width=70) (actual time=0.002..0.003 rows=0 loops=1)
Buffers: shared hit=1
-> Nested Loop (cost=0.28..19.64 rows=1
width=6418) (actual time=0.002..0.002 rows=0 loops=1)
Buffers: shared hit=1

* -> Index Scan using
tms_vessel_visit_test_vessel_visit_c_idx on tms_vessel_visit_test v
(cost=0.14..9.09 rows=10 width=80) (actual time=0.002..0.002 rows=0
loops=1) * Buffers: shared hit=1

* -> Index Scan using
tms_vessel_visit_aggregate_test_vessel_visit_c_idx on
tms_vessel_visit_aggregate_test a (cost=0.14..1.05 rows=1 width=42) (never
executed) * Index Cond:
(vessel_visit_c = v.vessel_visit_c)
Filter: (v.vessel_call_nr =
vessel_call_nr)
-> Subquery Scan on "*SELECT* 2" (cost=0.28..19.65
rows=1 width=70) (actual time=0.002..0.002 rows=0 loops=1)
Buffers: shared hit=1
-> Nested Loop (cost=0.28..19.64 rows=1
width=6404) (actual time=0.002..0.002 rows=0 loops=1)
Buffers: shared hit=1
-> Index Scan using
tms_vessel_visit_bak_test_pkey on tms_vessel_visit_bak_test v_1
(cost=0.14..9.09 rows=10 width=80)
(actual time=0.002..0.002 rows=0 loops=1)
Buffers: shared hit=1

* -> Index Scan using
tms_vessel_visit_aggregate_bak_test_pkey on
tms_vessel_visit_aggregate_bak_test a_1 (cost=0.14..1.05 rows=1 width=42)
(never executed) Index Cond:
(vessel_visit_c = v_1.vessel_visit_c)*
Filter: (v_1.vessel_call_nr = vessel_call_nr)
-> Bitmap Heap Scan on tms_tdr_region_map m (cost=3.92..6.06
rows=2 width=38) (never executed)
Recheck Cond: ((vsl_svc_c)::text = ("*SELECT*
1".svc_c)::text)
-> Bitmap Index Scan on tms_tdr_region_map_pk
(cost=0.00..3.92 rows=2 width=0) (never executed)
Index Cond: ((vsl_svc_c)::text = ("*SELECT*
1".svc_c)::text)

Another one where we have columns as varchar in the database the plan is
with seq join and application being tested is taking around 7-8 seconds.

Aggregate (cost=145.19..145.20 rows=1 width=32) (actual time=3.819..3.823
rows=1 loops=1)
Buffers: shared hit=71
-> Nested Loop (cost=39.54..145.16 rows=11 width=0) (actual
time=3.817..3.821 rows=0 loops=1)
Buffers: shared hit=71
-> Hash Join (cost=39.39..128.04 rows=5 width=5) (actual
time=1.154..3.806 rows=1 loops=1)
Hash Cond: (("*SELECT* 1".vessel_visit_c)::text =
(t.trgt_vvd_n)::text)
Buffers: shared hit=70
-> Append (cost=36.88..122.85 rows=996 width=13) (actual
time=0.806..3.569 rows=1009 loops=1)
Buffers: shared hit=67
-> Subquery Scan on "*SELECT* 1" (cost=36.88..115.01
rows=995 width=13) (actual time=0.806..3.318 rows=995 loops=1)
Buffers: shared hit=65
-> Hash Join (cost=36.88..105.06 rows=995
width=6361) (actual time=0.805..3.124 rows=995 loops=1)
Hash Cond: (((v.vessel_visit_c)::text =
(a.vessel_visit_c)::text) AND (v.vessel_call_nr = a.vessel_call_nr))
Buffers: shared hit=65

* -> Seq Scan on tms_vessel_visit v
(cost=0.00..62.95 rows=995 width=18) (actual time=0.007..0.278 rows=995
loops=1)* Buffers: shared hit=53
-> Hash (cost=21.95..21.95 rows=995
width=13) (actual time=0.774..0.775 rows=995 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 52kB
Buffers: shared hit=12

* -> Seq Scan on
tms_vessel_visit_aggregate a (cost=0.00..21.95 rows=995 width=13) (actual
time=0.008..0.275 rows=995 loops=1)*
Buffers: shared hit=12
-> Subquery Scan on "*SELECT* 2" (cost=1.35..2.86
rows=1 width=88) (actual time=0.064..0.099 rows=14 loops=1)
Buffers: shared hit=2
-> Hash Join (cost=1.35..2.85 rows=1
width=6422) (actual time=0.064..0.095 rows=14 loops=1)
Hash Cond: (((a_1.vessel_visit_c)::text =
(v_1.vessel_visit_c)::text) AND (a_1.vessel_call_nr = v_1.vessel_call_nr))
Buffers: shared hit=2

* -> Seq Scan on
tms_vessel_visit_aggregate_bak a_1 (cost=0.00..1.33 rows=33 width=60)
(actual time=0.011..0.015 rows=33loops=1)*
Buffers: shared hit=1
-> Hash (cost=1.14..1.14 rows=14
width=98) (actual time=0.027..0.027 rows=14 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 9kB
Buffers: shared hit=1

* -> Seq Scan on
tms_vessel_visit_bak v_1 (cost=0.00..1.14 rows=14 width=98) (actual
time=0.006..0.017 rows=14 loops=1)*
Buffers: shared hit=1
-> Hash (cost=2.50..2.50 rows=1 width=7) (actual
time=0.019..0.019 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=3
-> Index Scan using tms_tdr_pk on tms_tdr t
(cost=0.29..2.50 rows=1 width=7) (actual time=0.015..0.016 rows=1 loops=1)
Index Cond: (tdr_id = '13581258'::numeric)
Buffers: shared hit=3
-> Index Only Scan using tms_tdr_region_map_pk on
tms_tdr_region_map m (cost=0.15..3.40 rows=2 width=38) (actual
time=0.004..0.004 rows=0 loops
=1)
Index Cond: (vsl_svc_c = ("*SELECT* 1".svc_c)::text)
Heap Fetches: 0
Buffers: shared hit=1

Can you please help with any method where we can try to cast the views
creation with actual data type casting rather than text for characters.
Or are there any additional index classes which we can create to overcome
this situation.

Our more focus is if we can stop from typecasting to *text and t*est how an
application behaves.

On Fri, Mar 26, 2021 at 10:38 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Jagmohan Kaintura <jagmohan(at)tecorelabs(dot)com> writes:
> > When we are compiling View definition into database , the where clause on
> > the Character Columns is getting internally typecasted to text.
>
> That's normal. In Postgres, varchar has no operators of its own;
> everything you might do with it involves a (no-cost) cast to text.
> You might think of varchar as being a domain over text, though for
> historical reasons it's not implemented exactly that way.
>
> > These Columns are Primary Key columns and because of this Type Casting on
> > those columns Index scan is not happening and we are always getting
> > Sequential Scan.
>
> This, on the other hand, is just nonsense. You haven't presented
> any examples in which an indexscan looks likely to be a win.
> Postgres generally won't bother with an index when scanning a tiny
> table. Nor is an index necessarily helpful when doing a join,
> unless the join is designed to access just a small part of the table.
>
> > Please help us in identifying the solution.
>
> Please read
> https://wiki.postgresql.org/wiki/Slow_Query_Questions
>
> regards, tom lane
>

--
*Best Regards,*
Jagmohan

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Luca Ferrari 2021-03-30 06:38:34 Re: questions about wraparound
Previous Message Steve Baldwin 2021-03-30 00:40:00 Re: Row not immediately visible after commit