Re: Query that took a lot of time in Postgresql when not using trim in order by

From: Alex Ignatov <a(dot)ignatov(at)postgrespro(dot)ru>
To: Blas Pico <toni(dot)pico(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Query that took a lot of time in Postgresql when not using trim in order by
Date: 2015-11-25 16:22:28
Message-ID: 5655E044.6070207@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 25.11.2015 17:15, Blas Pico wrote:
>
> I have a query that produce a different query plan if I put a trim in
> one of the columns in the order by.
>
>
> When i put the trim in any column it use hashaggregate and took 3
> seconds against 30 when not.
>
>
> Is wear because the columns is clean not need to be trimmed, I have
> check that.
>
>
> The problem is that I can't change the query because is generate by
> the mondrian.
>
>
> I do research and found in postgres list that I need to crank work_mem
> up high but don't work for me.
>
>
>
> My postgresql.conf
>
>
> # Add settings for extensions here
>
>
> default_statistics_target = 50 # pgtune wizard 2014-06-04
>
>
> maintenance_work_mem = 1GB # pgtune wizard 2014-06-04
>
>
> constraint_exclusion = on # pgtune wizard 2014-06-04
>
>
> checkpoint_completion_target = 0.9 # pgtune wizard 2014-06-04
>
>
> effective_cache_size = 44GB # pgtune wizard 2014-06-04
>
>
> work_mem = 1536MB # pgtune wizard 2014-06-04
>
>
> #work_mem = 16GB # I have try this but don't work
>
>
> wal_buffers = 32MB # pgtune wizard 2014-06-04
>
>
> checkpoint_segments = 16 # pgtune wizard 2014-06-04
>
>
> shared_buffers = 15GB # pgtune wizard 2014-06-04
>
>
> max_connections = 20 # pgtune wizard 2014-06-04
>
>
>
>
> ___________________________________________________
>
>
>
>
> Query with trim
>
>
> SELECT "dim_cliente"."tipocliente" AS "c0",
>
>
> "dim_cliente"."a1_ibge" AS "c1",
>
>
> "dim_cliente"."a1_cod" AS "c2",
>
>
> "dim_cliente"."a1_nome" AS "c3",
>
>
> "dim_vendedor"."a3_nome" AS "c4"
>
>
> FROM "public"."dim_cliente" AS "dim_cliente",
>
>
> "public"."fato_ventas_productos" AS "fato_ventas_productos",
>
>
> "public"."dim_vendedor" AS "dim_vendedor"
>
>
> WHERE "fato_ventas_productos"."key_cliente" =
> "dim_cliente"."key_cliente"
>
>
> AND "fato_ventas_productos"."key_vendedor" =
> "dim_vendedor"."key_vendedor"
>
>
> GROUP
>
>
> BY "dim_cliente"."tipocliente" ,
>
>
> "dim_cliente"."a1_ibge",
>
>
> "dim_cliente"."a1_cod",
>
>
> "dim_cliente"."a1_nome",
>
>
> "dim_vendedor"."a3_nome"
>
>
> ORDER
>
>
> BY trim("dim_cliente"."tipocliente") ASC NULLS LAST,
>
>
> "dim_cliente"."a1_ibge" ASC NULLS LAST, -- the same result if I put
> the trim here
>
>
> "dim_cliente"."a1_cod" ASC NULLS LAST, -- or here
>
>
> "dim_cliente"."a1_nome" ASC NULLS LAST; -- or here
>
>
> -- this query took 3845.895 ms
>
>
>
>
> ___________________________________________________
>
>
>
>
> Query Plan when using trim
>
>
> QUERY PLAN
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> Sort (cost=193101.41..195369.80 rows=907357 width=129) (actual
> time=3828.176..3831.261 rows=43615 loops=1)
>
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome,
> (btrim((dim_cliente.tipocliente)::text))
>
>
> Sort Key: (btrim((dim_cliente.tipocliente)::text)),
> dim_cliente.a1_ibge, dim_cliente.a1_cod, dim_cliente.a1_nome
>
>
> Sort Method: quicksort Memory: 13121kB
>
>
> -> HashAggregate (cost=91970.52..103312.49 rows=907357 width=129)
> (actual time=2462.690..2496.729 rows=43615 loops=1)
>
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome,
> btrim((dim_cliente.tipocliente)::text)
>
>
> -> Hash Join (cost=856.30..80628.56 rows=907357 width=129)
> (actual time=29.524..1533.880 rows=907357 loops=1)
>
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
>
>
> Hash Cond: (fato_ventas_productos.key_vendedor =
> dim_vendedor.key_vendedor)
>
>
> -> Hash Join (cost=830.02..68126.13 rows=907357 width=86) (actual
> time=28.746..1183.691 rows=907357 loops=1)
>
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome,
> fato_ventas_productos.key_vendedor
>
>
> Hash Cond: (fato_ventas_productos.key_cliente =
> dim_cliente.key_cliente)
>
>
> -> Seq Scan on public.fato_ventas_productos
> (cost=0.00..46880.57 rows=907357 width=16) (actual
> time=0.004..699.779 rows=907357 loops=1)
>
>
> Output: fato_ventas_productos.key_cliente,
> fato_ventas_productos.key_vendedor
>
>
> -> Hash (cost=618.90..618.90 rows=16890 width=86) (actual
> time=28.699..28.699 rows=16890 loops=1)
>
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_cliente.key_cliente
>
>
> Buckets: 2048 Batches: 1 Memory Usage: 1980kB
>
>
> -> Seq Scan on public.dim_cliente (cost=0.00..618.90
> rows=16890 width=86) (actual time=0.008..16.537 rows=16890 loops=1)
>
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_cliente.key_cliente
>
>
> -> Hash (cost=18.90..18.90 rows=590 width=59) (actual
> time=0.747..0.747 rows=590 loops=1)
>
>
> Output: dim_vendedor.a3_nome, dim_vendedor.key_vendedor
>
>
> Buckets: 1024 Batches: 1 Memory Usage: 56kB
>
>
> -> Seq Scan on public.dim_vendedor (cost=0.00..18.90 rows=590
> width=59) (actual time=0.026..0.423 rows=590 loops=1)
>
>
> Output: dim_vendedor.a3_nome, dim_vendedor.key_vendedor
>
>
> Total runtime: 3845.895 ms
>
>
> (25 filas)
>
>
>
> ___________________________________________________
>
>
> Query without trim
>
> SELECT "dim_cliente"."tipocliente" AS "c0",
>
> "dim_cliente"."a1_ibge" AS "c1",
>
> "dim_cliente"."a1_cod" AS "c2",
>
> "dim_cliente"."a1_nome" AS "c3",
>
> "dim_vendedor"."a3_nome" AS "c4"
>
> FROM "public"."dim_cliente" AS "dim_cliente",
>
> "public"."fato_ventas_productos" AS "fato_ventas_productos",
>
> "public"."dim_vendedor" AS "dim_vendedor"
>
> WHERE "fato_ventas_productos"."key_cliente" =
> "dim_cliente"."key_cliente"
>
> AND "fato_ventas_productos"."key_vendedor" =
> "dim_vendedor"."key_vendedor"
>
> GROUP
>
> BY "dim_cliente"."tipocliente" ,
>
> "dim_cliente"."a1_ibge",
>
> "dim_cliente"."a1_cod",
>
> "dim_cliente"."a1_nome",
>
> "dim_vendedor"."a3_nome"
>
> ORDER
>
> BY "dim_cliente"."tipocliente" ASC NULLS LAST,
>
> "dim_cliente"."a1_ibge" ASC NULLS LAST,
>
> "dim_cliente"."a1_cod" ASC NULLS LAST,
>
> "dim_cliente"."a1_nome" ASC NULLS LAST;
>
> -- this query took 37249.268 ms
>
>
> ___________________________________________________
>
>
> Query Plan when not using trim
>
> QUERY PLAN
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Group (cost=170417.48..184027.84 rows=907357 width=129) (actual
> time=36649.329..37235.158 rows=43615 loops=1)
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
>
> -> Sort (cost=170417.48..172685.88 rows=907357 width=129) (actual
> time=36649.315..36786.760 rows=907357 loops=1)
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
>
> Sort Key: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
>
> Sort Method: quicksort Memory: 265592kB
>
> -> Hash Join (cost=856.30..80628.56 rows=907357 width=129)
> (actual time=26.719..1593.693 rows=907357 loops=1)
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_vendedor.a3_nome
>
> Hash Cond: (fato_ventas_productos.key_vendedor =
> dim_vendedor.key_vendedor)
>
> -> Hash Join (cost=830.02..68126.13 rows=907357 width=86) (actual
> time=25.980..1203.775 rows=907357 loops=1)
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome,
> fato_ventas_productos.key_vendedor
>
> Hash Cond: (fato_ventas_productos.key_cliente =
> dim_cliente.key_cliente)
>
> -> Seq Scan on public.fato_ventas_productos
> (cost=0.00..46880.57 rows=907357 width=16) (actual
> time=0.004..680.283 rows=907357 loops=1)
>
> Output: fato_ventas_productos.key_cliente,
> fato_ventas_productos.key_vendedor
>
> -> Hash (cost=618.90..618.90 rows=16890 width=86) (actual
> time=25.931..25.931 rows=16890 loops=1)
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_cliente.key_cliente
>
> Buckets: 2048 Batches: 1 Memory Usage: 1980kB
>
> -> Seq Scan on public.dim_cliente (cost=0.00..618.90
> rows=16890 width=86) (actual time=0.005..13.736 rows=16890 loops=1)
>
> Output: dim_cliente.tipocliente, dim_cliente.a1_ibge,
> dim_cliente.a1_cod, dim_cliente.a1_nome, dim_cliente.key_cliente
>
> -> Hash (cost=18.90..18.90 rows=590 width=59) (actual
> time=0.715..0.715 rows=590 loops=1)
>
> Output: dim_vendedor.a3_nome, dim_vendedor.key_vendedor
>
> Buckets: 1024 Batches: 1 Memory Usage: 56kB
>
> -> Seq Scan on public.dim_vendedor (cost=0.00..18.90 rows=590
> width=59) (actual time=0.024..0.405 rows=590 loops=1)
>
> Output: dim_vendedor.a3_nome, dim_vendedor.key_vendedor
>
> Total runtime: 37249.268 ms
>
> (25 filas)
>
>
> ___________________________________________________
>
>
> Is anything that I can do to solve this problem, is that a bug or a
> config problem?
>
>
> Here the link with a dump of the tables
>
> https://drive.google.com/file/d/0Bwupj61i9BtWZ1NiVXltaWc0dnM/view?usp=sharing
>
>
> I appreciate your help
>
Hello!
What is your Postgres version?
Do you have correct statistics on this tables?
Please show yours execution plans with buffers i.e. explain
(analyze,buffers) ...

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Evgeniy Shishkin 2015-11-25 16:35:15 Re: Query that took a lot of time in Postgresql when not using trim in order by
Previous Message Blas Pico 2015-11-25 14:15:31 Query that took a lot of time in Postgresql when not using trim in order by