Re: Can I do better than this heapscan and sort?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andy Halsall <halsall_andy(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Can I do better than this heapscan and sort?
Date: 2012-06-27 16:49:02
Message-ID: CAHyXU0yS5366FFN9t3b_0Fmbc7uxRGhjMLyV9paO6HKjFdZ=UA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Jun 27, 2012 at 10:40 AM, Andy Halsall <halsall_andy(at)hotmail(dot)com> wrote:
>
>
>> Date: Tue, 26 Jun 2012 08:42:34 -0500
>> Subject: Re: [PERFORM] Can I do better than this heapscan and sort?
>> From: mmoncure(at)gmail(dot)com
>> To: halsall_andy(at)hotmail(dot)com
>> CC: pgsql-performance(at)postgresql(dot)org
>
>>
>> On Tue, Jun 26, 2012 at 8:36 AM, Merlin Moncure <mmoncure(at)gmail(dot)com>
>> wrote:
>> > On Thu, Jun 21, 2012 at 3:07 PM, Andy Halsall <halsall_andy(at)hotmail(dot)com>
>> > wrote:
>> >> I have two tables node and relationship. Each relationship record
>> >> connects
>> >> two nodes and has an application keys (unfortunately named) that can be
>> >> used
>> >> by the application to look-up a relationship and get from one node to
>> >> the
>> >> other.
>> >>
>> >> My query uses a node id and a description of a relationship from the
>> >> node,
>> >> and must find the "next" relationship that the node has. It does this
>> >> by
>> >> finding all the relationships that could be "next", ordering them and
>> >> then
>> >> getting the first.
>> >>
>> >> Details are below but I end up with 6896 candidates for "next".
>> >>
>> >> If I'm reading the output correctly it takes 13.509 ms to apply the
>> >> filter
>> >> and another 7 ms or so to do the sort of the remaining 6896 nodes.
>> >>
>> >> Have tried many index combinations to try and improve the results. I
>> >> suspect
>> >> that with so many nodes to sort, postgresql will opt for heap scan
>> >> rather
>> >> than index. But why does it not use the IDX_order_sort_down_2 index for
>> >> the
>> >> sort?
>> >>
>> >> Thanks,
>> >> Andy
>> >>
>> >>
>> >> Details..........
>> >>
>> >> Version
>> >> -------
>> >> PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
>> >> 4.5.2,
>> >> 64-bit
>> >>
>> >> Tables
>> >> ------
>> >> CREATE TABLE node (
>> >>  node_id      bigint NOT NULL,
>> >>  node_type    int4 NOT NULL,
>> >>  c_state      int4 NOT NULL,
>> >>  d_state      int4 NOT NULL,
>> >>  sort_key     bigint NOT NULL,
>> >>  permissions  bytea NOT NULL,
>> >>  audit        bytea  NOT NULL,
>> >>  pkg_id       bytea NULL,
>> >>  created      timestamp NOT NULL
>> >> );
>> >>
>> >> CREATE TABLE relationship (
>> >>  rel_id           bigint NOT NULL,
>> >>  rel_type         integer NOT NULL,
>> >>  s_t_n            bigint NOT NULL,
>> >>  t_s_n            bigint NOT NULL,
>> >>  state            integer NOT NULL,
>> >>  control          integer NOT NULL,
>> >>  sort_key         bigint NOT NULL,
>> >>  prime_key        bytea NULL,
>> >>  prime_key_len    integer NOT NULL,
>> >>  sec_key          bytea NULL,
>> >>  sec_key_len      integer NOT NULL,
>> >>  up_sort_key      bigint NOT NULL,
>> >>  up_prime_key     bytea NULL,
>> >>  up_prime_key_len integer NOT NULL,
>> >>  up_sec_key       bytea NULL,
>> >>  up_sec_key_len   integer NOT NULL,
>> >>  permissions      bytea NOT NULL,
>> >>  t_s_n_type       integer NOT NULL,
>> >>  created          timestamp NOT NULL
>> >> );
>> >>
>> >> Constraints
>> >> -----------
>> >> -- Primary keys
>> >> ALTER TABLE node ADD CONSTRAINT PK_node PRIMARY KEY (node_id);
>> >>
>> >> ALTER TABLE relationship ADD CONSTRAINT PK_relationship PRIMARY KEY
>> >> (rel_id);
>> >>
>> >> -- Foreign keys
>> >> ALTER TABLE relationship ADD CONSTRAINT FK_node_s FOREIGN KEY (s_t_n)
>> >> REFERENCES node (node_id);
>> >>
>> >> ALTER TABLE relationship ADD CONSTRAINT FK_node_n FOREIGN KEY (t_s_n)
>> >> REFERENCES node (node_id);
>> >>
>> >>
>> >> Indexes
>> >> -------
>> >> CREATE INDEX IDX_node_type ON node (node_type ASC) TABLESPACE
>> >> ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_node_sort_key ON node (sort_key ASC) TABLESPACE
>> >> ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_relationship_s_t_n ON relationship (s_t_n ASC)
>> >> TABLESPACE
>> >> ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_relationship_t_s_n ON relationship (t_s_n ASC)
>> >> TABLESPACE
>> >> ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_relationship_type ON relationship (rel_type ASC)
>> >> TABLESPACE
>> >> ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_relationship_prime_key ON relationship (prime_key ASC)
>> >> TABLESPACE ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_relationship_u_prime_key ON relationship (up_prime_key
>> >> ASC)
>> >> TABLESPACE ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_relationship_sec_key ON relationship (sec_key ASC)
>> >> TABLESPACE ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_order_first ON node(sort_key DESC, node_id DESC)
>> >> TABLESPACE
>> >> ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_order_sort_down_1 ON relationship(sort_key DESC,
>> >> prime_key
>> >> ASC NULLS FIRST, sec_key ASC NULLS FIRST) TABLESPACE ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_order_sort_down_2 ON relationship(sort_key DESC,
>> >> prime_key
>> >> ASC NULLS FIRST, sec_key DESC NULLS FIRST) TABLESPACE ds_appex_ts_10
>> >> ;
>> >> CREATE INDEX IDX_order_sort_up ON relationship(up_sort_key DESC,
>> >> up_prime_key ASC NULLS FIRST, up_sec_key ASC NULLS FIRST) TABLESPACE
>> >> ds_appex_ts_10
>> >> ;
>> >>
>> >> Query
>> >> -----
>> >> CREATE OR REPLACE FUNCTION sp_get_rel_sort_dup_sec_desc(in_rel_type1
>> >> integer, in_rel_type2 integer, in_node_type integer, in_own_guid
>> >> bigint,
>> >> in_prev_prime_key bytea, in_prev_prime_key_len integer, in_prev_sec_key
>> >> bytea, in_prev_sec_key_len integer, in_prev_sort_key bigint, in_ctrl
>> >> integer) RETURNS select_rel_holder AS
>> >> '
>> >> declare
>> >> h select_rel_holder%rowtype;
>> >>
>> >> begin
>> >>        SELECT INTO h  r.rel_id, r.t_s_n, r.rel_type, r.sort_key,
>> >>                       r.state,r.permissions, r.control,
>> >>                       r.prime_key, r.prime_key_len, r.sec_key,
>> >> r.sec_key_len,
>> >>                       r.up_prime_key, r.up_prime_key_len, r.up_sec_key,
>> >> r.up_sec_key_len
>> >>         FROM relationship r
>> >>         WHERE r.s_t_n = in_own_guid AND (r.rel_type = in_rel_type1 OR
>> >> r.rel_type = in_rel_type2)
>> >>         AND
>> >>         (
>> >>             (
>> >>                (
>> >>                   r.prime_key > in_prev_prime_key
>> >>                   OR
>> >>                   ( r.prime_key = in_prev_prime_key AND r.sec_key <
>> >> in_prev_sec_key)
>> >>                )
>> >>                AND
>> >>                r.sort_key = in_prev_sort_key
>> >>             )
>> >>
>> >>             OR
>> >>             r.sort_key < in_prev_sort_key
>> >>         )
>> >>         AND t_s_n_type = in_node_type
>> >>         AND r.control >= in_ctrl
>> >>         ORDER BY sort_key DESC, prime_key ASC NULLS FIRST, sec_key DESC
>> >> NULLS FIRST LIMIT 1;
>> >>         RETURN h;
>> >> end
>> >> '
>> >> language 'plpgsql' STABLE;
>> >>
>> >>
>> >> EXPLAIN ANALYZE output
>> >> -------------------------------
>> >>         Limit  (cost=48.90..48.90 rows=1 width=89) (actual
>> >> time=21.480..21.480 rows=1 loops=1)
>> >>           Output: rel_id, t_s_n, rel_type, sort_key, state,
>> >> permissions,
>> >> control, prime_key, prime_key_len, sec_key, sec_key_len, up_prime_key,
>> >> up_prime_key_l
>> >> en, up_sec_key, up_sec_key_len
>> >>
>> >>           ->  Sort  (cost=48.90..48.90 rows=1 width=89) (actual
>> >> time=21.479..21.479 rows=1 loops=1)
>> >>                 Output: rel_id, t_s_n, rel_type, sort_key, state,
>> >> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len,
>> >> up_prime_key, up_prime
>> >> _key_len, up_sec_key, up_sec_key_len
>> >>                 Sort Key: r.sort_key, r.prime_key, r.sec_key
>> >>                 Sort Method: top-N heapsort  Memory: 25kB
>> >>
>> >>                 ->  Bitmap Heap Scan on public.relationship r
>> >> (cost=3.39..48.89 rows=1 width=89) (actual time=1.034..13.509 rows=6986
>> >> loops=1)
>> >>                       Output: rel_id, t_s_n, rel_type, sort_key, state,
>> >> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len,
>> >> up_prime_key, up
>> >> _prime_key_len, up_sec_key, up_sec_key_len
>> >>                       Recheck Cond: (r.s_t_n = $4)
>> >>                       Filter: ((r.control >= $10) AND (r.t_s_n_type =
>> >> $3)
>> >> AND ((r.rel_type = $1) OR (r.rel_type = $2)) AND ((((r.prime_key > $5)
>> >> OR
>> >> ((r.prime_
>> >> key = $5) AND (r.sec_key < $7))) AND (r.sort_key = $9)) OR (r.sort_key
>> >> <
>> >> $9)))
>> >>
>> >>                       ->  Bitmap Index Scan on idx_relationship_s_t_n
>> >> (cost=0.00..3.39 rows=18 width=0) (actual time=0.951..0.951 rows=6989
>> >> loops=1)
>> >>                             Index Cond: (r.s_t_n = $4)
>> >
>> > Absolutely.  You need to learn and master row-wise comparison.  It was
>> > added for exactly this purpose :-).
>> >
>> > SELECT * FROM foo WHERE (a,b,c) > (a1,b1,c1) ORDER BY a,b,c LIMIT k;
>> >
>> > will be fully optimized if you have an index on a,b,c (a1,b1,c1 are
>> > the last ones you read off).  Be advised that if there is not a lot of
>> > cardinality on 'a', you may need to disable certain index plans to get
>> > a good plan in some cases.
>>
>> hm, one more point: I notice you are mixing ASC/DESC in the index
>> definition. Try to avoid doing that: it will make index based paging
>> of the table more difficult. If you have to, try transforming the
>> values so that you can index all the fields ASC or DESC. This will
>> also fit easier into row-wise comparisons strategy although it will
>> slow down insertion a bit.
>>
>> merlin
>
> Thanks Merlin. I wasn't aware of the row-wise comparison stuff. It's tricky
> in the case above though as I want to do (a > b) OR (a=b AND b < c).
>
> As it happens, the application needs to iterate the set of "next"
> relationships, which as I had it meant establishing a new set on each
> iteration (where new set = old set minus its first entry). This is what the
> filter was doing in the above query. This is too expensive (as ~7000
> iterations) and so have modified solution to cache the ordered set when
> first established.
>
> Even without the filter I still need to order the return as above. Still
> don't understand why it doesn't use the index for this.

postgres isn't smart enough to convert the complex boolean expressions
for multiple field set ordering into a multi-column index lookup
unless the row-wise comparison syntax is used.

where (a,b) > (a1,b1)
can be written as boolean:
where (a>a1) OR (a=a1 AND b>b1)
alternate form is:
where (a>=a1) AND (a>a1 OR b>b1)

where (a,b,c) > (a1,b1,c1)
can be written as boolean:
where (a>a1) OR (a=a1 AND b>b1) OR (a = b1 AND b1 AND c>c1)
alternate form is:
where (a>=a1) AND (a>a1 OR b>=b1) AND (a>a1 OR b>b1 OR c>=c1)

in either case using boolean construction postgres will only use an
index on 'a' if it exists, but will fully optimize the row-wise case
if it matches the index. therefore, making your query faster using
row-wise is going to be an exercise of making an index matching your
set browsing that contains either all ASC or all DESC -- you can't mix
and match. in order to do that some transformation of your values is
in order -- either directly on the table itself or with functional
indexes doing the transformation during index build. For integers we
can do that by multiplying by -1. other datatypes might have more
complicated constructions but it's usually doable.

all this is assuming btw that you can logically order your table with
a consistent ordering and you are trying to index lookups between some
two points withing that ordering (perhaps sliding that windows as you
crawl the table)

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Cédric Villemain 2012-06-28 10:22:35 Re: Performance of a large array access by position (tested version 9.1.3)
Previous Message Willy-Bas Loos 2012-06-27 13:23:58 Re: [PERFORM] [performance] fast reads on a busy server