Re: Query taking long time

From: Venkata Balaji Nagothi <vbnpgc(at)gmail(dot)com>
To: acanada <acanada(at)cnio(dot)es>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query taking long time
Date: 2014-03-03 23:28:22
Message-ID: CAHBAh5s8_Q9hnHbNgEhukS1qGuz-Gzc0NZgKfj3B0vZYx6cUWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Mar 3, 2014 at 9:17 PM, acanada <acanada(at)cnio(dot)es> wrote:

> Hello,
>
> Thankyou for your answer.
> I have made more changes than a simple re-indexing recently. I have moved
> the sorting field to the table in order to avoid the join clause. Now the
> schema is very simple. The query only implies one table:
>
> x=> \d+ entity_compounddict2document;
> Table "public.entity_compounddict2document"
> Column | Type | Modifiers | Storage
> | Description
>
> ------------------+--------------------------------+-----------+----------+-------------
> id | integer | not null | plain
> |
> document_id | integer | | plain
> |
> name | character varying(255) | | extended
> |
> qualifier | character varying(255) | | extended
> |
> tagMethod | character varying(255) | | extended
> |
> created | timestamp(0) without time zone | | plain
> |
> updated | timestamp(0) without time zone | | plain
> |
> curation | integer | | plain
> |
> hepval | double precision | | plain
> |
> cardval | double precision | | plain
> |
> nephval | double precision | | plain
> |
> phosval | double precision | | plain
> |
> patternCount | double precision | | plain
> |
> ruleScore | double precision | | plain
> |
> hepTermNormScore | double precision | | plain
> |
> hepTermVarScore | double precision | | plain
> |
> Indexes:
> "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
> "entity_compound2document_cardval" btree (cardval)
> "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
> "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
> "entity_compound2document_hepval" btree (hepval)
> "entity_compound2document_name" btree (name)
> "entity_compound2document_nephval" btree (nephval)
> "entity_compound2document_patterncount" btree ("patternCount")
> "entity_compound2document_phosval" btree (phosval)
> "entity_compound2document_rulescore" btree ("ruleScore")
> Has OIDs: no
>
> tablename | indexname
> | num_rows | table_size | index_size
> | unique | number_of_scans | tuples_read | tuples_fetched
> entity_compounddict2document | entity_compound2document_cardval
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
> entity_compounddict2document |
> entity_compound2document_heptermnormscore | 5.42452e+07 | 6763 MB |
> 1162 MB | Y | 0 | 0 | 0
> entity_compounddict2document | entity_compound2document_heptermvarscore
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
> entity_compounddict2document | entity_compound2document_hepval
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
> entity_compounddict2document | entity_compound2document_name
> | 5.42452e+07 | 6763 MB | 1505 MB | Y | 24 |
> 178680 | 0
> entity_compounddict2document | entity_compound2document_nephval
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
> entity_compounddict2document | entity_compound2document_patterncount
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
> entity_compounddict2document | entity_compound2document_phosval
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
> entity_compounddict2document | entity_compound2document_rulescore
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
> entity_compounddict2document | entity_compounddict2document_pkey
> | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 |
> 0 | 0
>
> The table has aprox. 54,000,000 rows
> There are no NULLs in hepval field and pg_settings haven't changed. I also
> have done "analyze" to this table.
>
> I have simplified the query and added the last advise that you told me:
>
> Query:
>
> explain analyze select * from (select * from entity_compounddict2document
> where name='ranitidine') as a order by a.hepval;
>
> QUERY PLAN
>
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------
> Sort (cost=11060.50..11067.55 rows=2822 width=133) (actual
> time=32715.097..32716.488 rows=13512 loops=1)
> Sort Key: entity_compounddict2document.hepval
> Sort Method: quicksort Memory: 2301kB
> -> Bitmap Heap Scan on entity_compounddict2document
> (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483
> rows=13512 loops=1)
> Recheck Cond: ((name)::text = 'ranitidine'::text)
> -> Bitmap Index Scan on entity_compound2document_name
> (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512
> loops=1)
> Index Cond: ((name)::text = 'ranitidine'::text)
> Total runtime: 32717.548 ms
>
> Another query:
> explain analyze select * from (select * from
> entity_compounddict2document where name='progesterone' ) as a order by
> a.hepval;
>
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
> Sort (cost=367879.25..368209.24 rows=131997 width=133) (actual
> time=9262.887..9287.046 rows=138165 loops=1)
> Sort Key: entity_compounddict2document.hepval
> Sort Method: quicksort Memory: 25622kB
> -> Bitmap Heap Scan on entity_compounddict2document
> (cost=2906.93..356652.81 rows=131997 width=133) (actual
> time=76.316..9038.485 rows=138165 loops=1)
> Recheck Cond: ((name)::text = 'progesterone'::text)
> -> Bitmap Index Scan on entity_compound2document_name
> (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913
> rows=138165 loops=1)
> Index Cond: ((name)::text = 'progesterone'::text)
> Total runtime: 9296.815 ms
>
>
> It has improved (I supose because of the lack of the join table) but still
> taking a lot of time... Anything I can do??
>
> Any help would be very appreciated. Thank you very much.
>

Good to know performance has increased.

"entity_compounddict2document" table goes through high INSERTS ?

Can you help us know if the "helpval" column and "name" column have high
duplicate values ? "n_distinct" value from pg_stats table would have that
info.

Below could be a possible workaround -

As mentioned earlier in this email, a composite Index on name and hepval
column might help. If the table does not go through lot of INSERTS, then
consider performing a CLUSTER on the table using the same INDEX.

Other recommendations -

Please drop off all the Non-primary key Indexes which have 0 scans / hits.
This would harm the DB and the DB server whilst maintenance and DML
operations.

Regards,
Venkata Balaji N

Fujitsu Australia

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Damon Snyder 2014-03-04 01:12:16 Re: Help with optimizing a query over hierarchical data
Previous Message Claudio Freire 2014-03-03 21:52:06 Re: Help with optimizing a query over hierarchical data