Re: Equivalence Classes when using IN

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Kim Rose Carlsen <krc(at)hiper(dot)dk>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Equivalence Classes when using IN
Date: 2017-10-09 10:47:59
Message-ID: CAKJS1f8EhaRwra-mc4bm4Q4ButmTi-7W4kn6wicB5SySb5bvnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 9 October 2017 at 22:39, Kim Rose Carlsen <krc(at)hiper(dot)dk> wrote:
> EXPLAIN ANALYZE
> SELECT *
> FROM customer
> JOIN view_customer
> ON customer.customer_id = view_customer.customer_id
> WHERE age < 20;
>
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop Left Join (cost=139.00..10392.96 rows=668 width=16) (actual time=0.528..35.120 rows=200 loops=1)
> Join Filter: (c.customer_id = product.customer_id)
> Rows Removed by Join Filter: 199900
> -> Nested Loop (cost=0.28..199.21 rows=334 width=12) (actual time=0.075..1.146 rows=100 loops=1)
> -> Seq Scan on customer (cost=0.00..21.51 rows=334 width=8) (actual time=0.067..0.282 rows=100 loops=1)
> Filter: (age < 20)
> Rows Removed by Filter: 901
> -> Index Only Scan using customer_pkey on customer c (cost=0.28..0.53 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=100)
> Index Cond: (customer_id = customer.customer_id)
> Heap Fetches: 100
> -> Materialize (cost=138.73..173.75 rows=2001 width=8) (actual time=0.005..0.130 rows=2001 loops=100)
> -> Sort (cost=138.73..143.73 rows=2001 width=8) (actual time=0.448..0.588 rows=2001 loops=1)
> Sort Key: product.customer_id, product.product_id
> Sort Method: quicksort Memory: 142kB
> -> Seq Scan on product (cost=0.00..29.01 rows=2001 width=8) (actual time=0.006..0.215 rows=2001 loops=1)
> Planning time: 0.214 ms
> Execution time: 35.284 ms

You would benefit from adding the age column to view_customer, or at
least consider having some view which contains all the columns you'll
ever need from those tables and if you need special views with only a
subset of columns due to some software doing "select * from
viewname;", then you could just create some. Joining to the same table
again seems like a bit of a waste of effort for the planner and
executor. I'd assume customer_id is the PRIMARY KEY of customer and
is unique.

It's not all that clear what your view is doing here. Confusingly
there's a Sort in the plan, yet nothing in the query asked for that,
so I guess that the view must have an ORDER BY. If you get rid of that
the planner would likely use an index on product (customer_id) to
parameterise the nested loop, at least, it likely would, if you have
one.

It's pretty bad practice to have ORDER BY in views. I kinda wish we
didn't even allow it, but that ship sailed many years ago...

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Durumdara 2017-10-09 11:12:29 Error: "cached plan must not change result type"
Previous Message Kim Rose Carlsen 2017-10-09 09:39:39 Re: Equivalence Classes when using IN